Conceptual Models II, Usability Inspection Methods

From CS160 Spring 2014
Readings

Finish the reading you started for Monday:

Featured Responses

Andrea Campos - 2/12/2014 2:41:34

A gulf of execution exists when it is difficult for users to accomplish what they want to do with a system, so they have to exert more cognitive effort to figure out how to accomplish their goals. This can be bridged by making the controls, input descriptions and overall usage of the system conform more to the way users think and to the sorts of languages they are used to, so they can more easily do what they need. A gulf of evaluation exists when it is difficult for users to understand or interpret the outputs and information of a system, and thus difficult to evaluate whether their goals have been achieved. This can be bridged by matching outputs to the user's cognitive models and ways of thinking, making outputs more direct and visual, and having objects' forms suggest their meaning.

Heuristic evaluation can be more beneficial than usability testing because it results in extremely specific feedback -- "evaluators" judge an interface against a set of tried and true principles, and so are given a vocabulary with which to assess the interface. In particular, you get the information you are especially interested in about a system, rather than extraneous or irrelevant commentary. Plus, once you have a list of specific heuristics that have been violated, it is much easier to see what needs to be redesigned.

I have a flight tracker app that lets you input the flights you'll take on a trip so you can see real-time information about flight takeoff time, gate changes, delays, etc. It's actually really useful, but it violates a few heuristics: 1. User control and freedom: When my destination flight got cancelled but I kept the same return flight, I wanted to be able to edit just the destination flight. However, the app didn't give you that option, so you had to make an entirely new "trip" and input information for both the destination and return flights all over again. 2. Recognition rather than recall: You need pretty specific information on hand or in memory to track a flight, from the two-character airline code to the airport code and the flight number. It seems it'd be easier to use if you could enter destination cities to bring up airports, flights, etc. from which to choose.

Matthew Deng - 2/12/2014 17:01:46

The gulf of execution is the distance between the user's intention and the instructions given to the machine. The gulf of evaluation is the distance between the output of the machine and the user's understanding of it. For example, if a user wanted to find out how much money they would have to spend to paint a room, an application with a narrow gulf of execution would ask the user to input the dimensions of the room, while an application with a wider gulf of execution would ask for the total surface area of the walls to be painted, which would force the user to calculate it on their own from the dimensions. Continuing with this example, the application would have a narrow gulf of evaluation if it output the price of the paint needed to paint the room, and a wider gulf of evaluation if it simply output the volume of paint needed, from which the user must then calculate the price. These gulfs can be bridged in multiple ways. First, we look at semantic distance, which separates the user's goals from the meanings of expression. The gulfs can be bridged in two ways: from the user end and from the machine end. From the user end, the user can learn to think in the same way as the machine, so that the user's inputs will match the machine and the output will match their goal. Likewise, from the machine end, the machine can be redesigned to specifically fit the user's intentions. In addition to semantic distance, there is articulatory distance, which separates the user's form of expression from the meaning of expression. The gulf of execution can be bridged by allowing the user to mimic the actions they want to do. On the other end, the gulf of evaluation can be bridged by making the machine's output easy for the user to understand.
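
The paint example can be sketched in a few lines of Python. This is only an illustration: the function names, the coverage rate, and the price per liter are all assumed values that do not come from the response. The first entry point takes room dimensions and returns a price (narrow gulfs); the second takes a precomputed wall area and returns only a paint volume (wider gulfs).

```python
# Assumed constants for illustration only (not from the response).
COVERAGE_SQ_M_PER_LITER = 10.0  # hypothetical: one liter covers 10 square meters
PRICE_PER_LITER = 12.50         # hypothetical price


def wall_area(length_m, width_m, height_m):
    """Area of the four walls of a rectangular room, ignoring doors and windows."""
    return 2 * (length_m + width_m) * height_m


def paint_cost_from_dimensions(length_m, width_m, height_m):
    """Narrow gulfs: the user enters what they know and gets the price they want."""
    liters = wall_area(length_m, width_m, height_m) / COVERAGE_SQ_M_PER_LITER
    return liters * PRICE_PER_LITER


def paint_volume_from_area(area_sq_m):
    """Wider gulfs: the user must compute the area first and the price afterwards."""
    return area_sq_m / COVERAGE_SQ_M_PER_LITER
```

With the second function, the user does the work of `wall_area` before the call and the multiplication by price after it; with the first, the system does both.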

Heuristic evaluation can be more beneficial than usability testing not only because it saves money and time, but also because it can be used effectively early in the development process to steer the project in the right direction. One mobile application that I have used that violates a few heuristics is the AC Next Bus application. First of all, it violates the "Error Prevention" heuristic. Every time you load the application, it asks if you want to update the stops; however, every single time it says "Error updating stops - please try again." Going on from this, it violates the "Help users recognize, diagnose, and recover from errors" heuristic in the sense that it tells you to try the "Update Stops" option again, but that option will fail once again. However, you can go to the main screen and press "Last Updated: HH:MM:SS" to actually update the stops. It is not very clear that this is meant to be a button, which makes the application somewhat lacking in the "Help and documentation" heuristic as well.

Andrew Fang - 2/11/2014 0:04:52

The gulfs of execution and evaluation are gaps that limit the coherency between a user and a computer system. The gulf of execution is a gap between what a user wants to do within a computer system and what commands and actions that system allows for. The gulf of evaluation is the gap between what a computer system displays as visual output and how well the user interprets that information. To close these gaps, a designer should minimize the amount that the user has to think. In order to bridge the gulf of execution, the maker of a system should design the commands and input to match the user’s goals, abilities, and thought processes. When the user wants to accomplish a task, he should be able to manipulate the system in a way that is intuitive or easily learnable. In order to bridge the gulf of evaluation, the system should display its output in a format that is easily comprehensible and interpretable by the user. The interface should be clear of all distractions, and the machine’s displayed results should be clearly arranged so the user knows exactly what has happened in the system.

In a usability test, the experimenter often has to make inferences about the usability of a system based on the user’s chain of actions while working with an interface. The experimenter’s judgement may not be entirely correct, and he may misinterpret the user’s intentions. On the other hand, in a heuristic evaluation, the experimenter needs only to record the user’s comments. When the report comes directly from the user’s point of view, we gain insight into the reasons behind the user’s actions and we can pinpoint flaws in the interface design with greater accuracy. Because the results come firsthand from the user, rather than being filtered by a subjective experimenter, we get direct feedback as to what is confusing and what needs improvement. In addition, a heuristic test can be performed in the early stages of product development, even before the actual product has been made. Shortly after the prototyping stages, even if a design only exists as sketches, we can use a heuristic evaluation to gauge initial remarks on the product.

I believe the Android Flipboard application violates three heuristics. First, it violates the consistency and standards heuristic. When I click on an interesting article I want to read, sometimes the article is formatted so that I can flip through it as I flip through the various articles on the title page. However, some articles, when I click on them, render the entire webpage that the article was sitting in, and I have to zoom in to read the incredibly small text rather than being able to flip through. Second, it violates error prevention. When I want to add new topics to my Flipboard, it will often freeze the other topics that I have loaded on my main page so that I cannot click on them or interact with them. To resolve this problem, I usually have to quit and restart the application, which is mildly annoying. Third, it violates help and documentation. There is no settings page or tutorial option to teach me how I can interact with the application. If there is, I cannot find it. I had to touch, press, and flip my way through to figure out what I can do in the app.

Reading Responses

Haley Rowland - 2/11/2014 18:05:17

The gulf of execution describes the discrepancy between the commands and mechanisms of a system and the user’s thoughts and goals. The gulf of evaluation is the gap between the system’s output and the user’s ability to perceive, interpret, and evaluate that output. To bridge the gulf of execution, the system must provide a higher-level structure to make the model world more easily translatable to the task at hand. To bridge the gulf of evaluation, the system’s output should clearly and easily show the user whether the goal has been achieved.

Because heuristic evaluation examines the interface design only, it doesn’t require functionality of the application and thus can be done with a low-fidelity prototype early in the design cycle. I decided to evaluate the UC Berkeley iPhone app. It violates the following heuristics:

- Consistency and standards: the menu labels are somewhat ambiguous (“Maps & Tour”, “Library”, etc.) and users might not know where to look if they want to find directions to a library, for example. Different menus might lead to the same thing. There are also two buttons that lead to the previous screen which might confuse users as to how to actually navigate to the previous screen.

- Match between system and the real world: the map section provides interaction with a Google map, which is helpful, but the names of buildings are not readily accessible or labeled. For example, instead of “Evans Hall”, the building is labeled as “Dept of Mathematics.” This is not useful for a student trying to locate a building by name using this map.

- User control and freedom: pressing the “back” button on the bottom left corner brings the user back to the home screen rather than to the screen immediately preceding the current view. This makes the user re-navigate to the desired screen from the home menu, which is a hassle.


Gregory Quan - 2/11/2014 20:04:09

The gulf of execution is the gap between a user’s intention and the execution of that intention. This gulf can be bridged by the user, by the interface, or both. For example, if a user wants to add two numbers together on a computer, he or she could translate the operation into a sequence of binary numbers for the computer to execute. The user bridges the gulf by translating his intention into a language the computer can understand. Alternatively, he could open up a graphical calculator on the computer, which does the job of translating the keystrokes into machine code for him, thereby bridging the gulf.

The gulf of evaluation is the gap between the output of some action and the user’s understanding of whether his or her goal has been achieved. This gulf must also be bridged by the user, the interface, or both. For example, if a user has a digital thermometer and wants to know the day-by-day temperature change, he would have to record the temperature each day and calculate the change manually. This is an example of the user bridging the gulf. The thermometer could be equipped with some circuitry to do this task automatically and display the temperature change to the user. In this scenario, the interface would bridge the gulf of evaluation.
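
As a rough sketch of the thermometer example, the "circuitry" that bridges the gulf of evaluation amounts to a subtraction the device performs instead of the user. The function name and the sample readings below are hypothetical, chosen only for illustration.

```python
def daily_changes(readings):
    """Return the change between each pair of consecutive daily readings.

    `readings` is a list of temperatures, one per day, in recording order.
    This is the calculation the user would otherwise do by hand.
    """
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


# Hypothetical readings for three consecutive days.
changes = daily_changes([15.0, 17.5, 16.0])
```

Here `changes` holds one value per day-to-day transition, so three readings yield two changes.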

Heuristic evaluation can be more beneficial than usability testing because it is performed by people focused on looking for usability issues, rather than just average users. Therefore, the evaluator can discover issues that regular users might not notice during normal user testing. Also, since the evaluators are not using the system to perform a real task, it is possible to perform heuristic evaluation of user interfaces that exist on paper only and have not yet been implemented, which allows for testing of an interface early in the design lifecycle.

The Yelp mobile application violates a few interface design heuristics:

1. It does not have any help documentation, so it is hard to figure out how to, for example, change one’s profile picture.

2. When looking for Yelp deals, the app only displays deals for businesses nearby. There is no option to change the search area, which violates the heuristic of flexibility and efficiency of use.

3. One cannot remove businesses from their bookmark list intuitively. The user must click on the business, then click again on the bookmark icon to “unbookmark” the business, which is an inefficient process. Also, there is no option to sort or select multiple bookmarks for mapping. This violates the heuristic of flexibility and efficiency of use.


Jay Kong - 2/11/2014 20:36:41

The gulf of execution is the gap between the user's goal and the means to execute actions to that goal. The gulf of evaluation is how well a system provides representations that can be perceived in terms of the user's expectations. Each gulf is composed of semantic distance and articulatory distance, both of which can be shortened. To shorten semantic distance, one could make a system do more high-level things, making it less cognitively challenging for the user to complete tasks. Another way would be to train the user so that he/she becomes competent enough to think on the same level as the system. On the other hand, a way to shorten articulatory distance would be to provide an interface that permits specification of an action by mimicking it.

Heuristic evaluation is more beneficial than usability testing in a few ways. The logistics are much simpler because fewer evaluators are needed -- it is easier to schedule and a lot faster to complete. In other words, you get more bang for your buck in terms of finding issues with a user interface. In fact, heuristic evaluation is often known as "discount usability engineering" due to its efficiency. As stated in the text, one example yielded a benefit of $500,000 from only $10,500 worth of testing.

The mobile application in violation of a few heuristics is Osu! Droid, a mobile version of the popular rhythm game Osu!. It violates the following heuristics:

1) Visibility of system status: Through the application, a user is able to download "beatmaps" or songs that can be played. However, after you press download on a certain beatmap, there exists no indicator of how the download is proceeding. In fact, the user won't even know if a beatmap is downloading at all until the download is complete.

2) Error prevention: Due to how the download screen is laid out, it's very easy to accidentally tap and download a beatmap while a user is scrolling. Instead of asking if a user truly wants to download a selected beatmap, the application automatically starts downloading it.

3) User control and freedom: When a user accidentally downloads the wrong beatmap, which could quite easily happen as mentioned previously, there's no way to cancel the download of the beatmap. In other words, there's no "undo" functionality.


Justin MacMillin - 2/11/2014 23:24:54

1) The gulf of execution is the gap between how clear the user interface is and what the user must come up with on their own in order to interact with the interface. For example, if the user interface offers enough explanation that it leaves little for the user to figure out, the gulf of execution is small. How easy the interface is to use depends on the structure that implements the interface. If the implementation is difficult to use, then it is completely on the user to cross the gulf of execution. The gulf of evaluation is the relationship between what the user is trying to do with the program and what the program outputs. If the user has to calculate in any way the information that they need, instead of the program outputting the information for them, then the gulf of evaluation is wider.

One way to bridge the gulf of execution is to write the program in a higher-level language, a language further away from the computer's assembly. The closer the language is to the user, the easier it is to use and the less difficult it becomes for the user to interact with the interface. A downside to this is that the functions become more complicated in order to spare the user the complications that programmers should consider. It becomes the job of the programmers to implement higher-level languages using lower-level ones, keeping the best abstraction strategies possible. The user could run into problems if the interface is abstracted too much, which will limit the user's control over the interface. In this case, the programmers should consider who their target audience is and determine their level of control accordingly. Another way to help close the gulf of execution is to familiarize the users with the program before they use it. This will make the less skilled users better adapted to using the program. A downside to this is that it fixates the users on using the program in a certain way, which may not be the most creative or efficient way. A way to bridge the gulf of evaluation is to consider how the user might be using the program. How the user interacts with the program can be an indicator of what kind of outputs the user expects.

2) Heuristic evaluation is more effective than usability testing mainly because heuristic evaluation is more organized, with focused issues. Usability testing is less organized in that it does not group issues into overall types. The article we read does admit that not all problems from all categories can be fixed; however, organizing ideas is still a great indicator that the team is unified and knows the issues at hand. Within each subgroup of problems, the team can decide which ones are most important.

Soundhound is a mobile application that I believe violates two heuristics. This application is similar to Shazam: it listens to a song and identifies it for the user. First off, the buttons along the bottom of the main screen (where you tell the application to begin listening) are pictures without any words. While the designers may think that the pictures indicate what page each of those buttons leads to, they do not, even to me, and I consider myself a skilled user. This problem violates the “Help and Documentation” heuristic. It is difficult for the user to know everything about all the app's features when they are not documented and properly labelled. In addition, the app violates the “Error Prevention” heuristic when it does not recognize a song. It only tells the user that it could not hear the song and to move the receiver closer to the song player. This page is displayed every time the app does not recognize a song, regardless of whether it could not hear the song, does not know what the song is, or can’t decide between two or more songs. In my opinion, the error page should tell the user what likely went wrong.
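
One way to picture the suggestion about the error page is a sketch in which each failure cause gets its own message. Everything here is hypothetical: the failure categories come from the three cases named in the response, while the names, message texts, and structure are invented for illustration and say nothing about how Soundhound is actually built.

```python
from enum import Enum


class RecognitionFailure(Enum):
    """Hypothetical failure causes, taken from the three cases described above."""
    TOO_QUIET = "too_quiet"   # the app could not hear the song
    NO_MATCH = "no_match"     # the song was heard but is not known
    AMBIGUOUS = "ambiguous"   # two or more songs matched equally well


# Hypothetical message texts: one distinct explanation per cause.
MESSAGES = {
    RecognitionFailure.TOO_QUIET: "We could not hear the song. Try moving closer to the speaker.",
    RecognitionFailure.NO_MATCH: "We heard the song but could not find it in our catalog.",
    RecognitionFailure.AMBIGUOUS: "Several songs matched. Try listening to a longer sample.",
}


def error_message(failure):
    """Return the message that explains this specific failure cause."""
    return MESSAGES[failure]
```

The point of the sketch is only that the message depends on the cause, so the user learns what likely went wrong instead of seeing the same page every time.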


Luke Song - 2/11/2014 23:33:39

The gulfs of execution and evaluation describe the bridge between the user and the machine through the interface. The gulf of execution describes both the semantic distance (the difference between the user's conception of a task and what the interface provides to the user to complete it) and the articulatory distance (the difference between the interface's functionality and the physical objects that get modified). On the gulf of evaluation's side, the semantic distance describes how much the user works to understand what the interface presents as the result of an action, and the articulatory distance describes how far the representation of an object in the interface is from its real form.

One way that semantic distance can be reduced is for the user to learn how to use the interface. After a lot of experience with a particular interface, the user adapts to it and spends less energy translating their intentions to actions. Higher-level and more specialized languages also try to reduce the semantic distance in an interface. To reduce articulatory distance, it is convenient to reduce the abstraction of an object's representation from the actual object.

Heuristic evaluation allows the user to judge what is good or bad about the interface, while traditionally that was the job of the observer. In addition, heuristic users are free to ask questions about the interface, and can be used to test prototypes and underdeveloped interfaces.

I have an application on my phone that is essentially a Python IDE; I can write Python programs on it and run them, which can be useful at times. It violates a few heuristics, however. The interface is quite error-prone, simply because the keyboard auto-corrects too often and is suited for a messaging environment rather than a code development one. In addition, the application requires recall more than recognition: when I begin to type out a function or class, the keyboard doesn't suggest any completions that are relevant to the code that I'm in the middle of writing. The default screen of the application is not very minimalistic, bearing multiple options that are either redundant or won't be used often enough to justify being on the default screen.


Emon Motamedi - 2/11/2014 23:42:12

1) In using direct manipulation interfaces, users feel an impression of directness about the interface. That feeling of directness comes from two aspects: distance and direct engagement. Distance houses the gulfs of execution and evaluation, and reflects the relationship between the physical actions the system is capable of and the thoughts of the user seeking to complete a task. The weakness in this relationship is what is referred to by the aforementioned gulfs.

The gulf of execution reflects how much (or how little) of a user's thoughts and desires are addressed and alleviated through the potential actions and commands of the system. The gulf of evaluation reflects how much (or how little) the visual and graphical output of the system provide the user with clear knowledge of the model of the system that he or she can understand and evaluate. Each of these gulfs is one-directional, with the gulf of execution reflecting the relationship of the user's goals to the capabilities of the system and the gulf of evaluation reflecting the relationship of the visual output of the system to the understanding of the user.

In order to bridge the gulf of execution, designers must do a lot of research on users to glean thorough knowledge of their needs and desires of the system. They must completely understand their goals in order to build a system that accomplishes said goals. Then, designers must test the system on users to ensure that all of their desired tasks are being accomplished and that there are no additional desires the users have. To bridge the gulf of evaluation, designers must create many prototypes of what the visual output of the system will look like for the users. They must constantly test these prototypes to ensure that users see what they are supposed to see, understand what they are seeing, and can make judgements accordingly. These prototypes must constantly be revamped as they become higher and higher in fidelity, and the final version of the system must also be tested on users to ensure that the gulf of evaluation is bridged as much as possible.

2) Usability testing forces the observer to interpret the actions of the user, an interpretation that can oftentimes be incorrect or obtuse. On the other hand, an observer in heuristic evaluations only has to read the notes of evaluators and does not need to worry about (potentially incorrectly) interpreting anything. Secondly, a heuristic evaluation uses time more efficiently. Because the observer is allowed to guide the evaluator when he or she reaches an issue, these issues can be overcome more quickly and time will be saved. In usability testing, the observer does not guide users and therefore users can get stuck on one problem for a very long period of time. Finally, and similarly, because observers are there to answer questions and guide evaluators, evaluators can get a better sense of the usability of the interface in terms of features of the specific domain they are evaluating.

Mobile application that violates heuristics: TripAdvisor City Guides

1. Flexibility and efficiency of use - I perform the same actions on the app frequently, such as checking for a city's top excursions and restaurants. However, in order to accomplish these actions, I have to click through 5 or 6 different pages, pages that include a selection of the city I am in. If the application could provide an accelerator that helps me accomplish these frequent tasks more quickly, it would be a much more efficient application.

2. Aesthetic and minimalist design - In showing me a restaurant or a monument, the application provides much more information than is necessary or required. This information includes history of the establishment, photos, address, and reviews. Rather than showing me all of this information at once, the application should allow me to click into the options I wish to see to reduce clutter.

3. Error prevention - Certain commands often lead the application to crash. I would avoid these commands if I knew the application would crash, so the application should provide me with a confirmation option before I commit an action that is known to lead to crashes. Taking this one step further, the application should eliminate these error conditions to begin with.


Ziran Shang - 2/11/2014 23:45:41

The gulf of execution is the difference between something the user wants to do, and the means of accomplishing that thing. The gulf of evaluation is how much work the user has to do in order to see whether the goal has been accomplished. The gulf of execution can be bridged by forming an intention that specifies the meaning of input that will accomplish the user's goal, and forming an action specification that describes how to make input that has that meaning. The gulf of evaluation can be bridged by interpretation, which derives the meaning of an output, and evaluation, which looks at the meaning of the output in relation to the user's goal.

Heuristic evaluation is more efficient and cheaper than usability testing. Also, because heuristic evaluation looks simply at a set of guidelines, testers don't need to have background knowledge about the application. On the other hand, usability testing also looks at testers' actions, so people who don't know how to use an app can skew results of the usability test.

An app that violates two heuristics is the Phone app in iOS. The first heuristic that it violates is error prevention. If you don't have service, or have a really weak signal, you can still dial a number without any warning from the app itself. You'll only find out if the call doesn't go through, or if the other person can't hear you. The second heuristic the app violates is helping users recognize, diagnose and recover from errors. The only error message the app really gives the user is "call failed". However, it does not tell them why, or whether it is a problem on the user's end or the other person's end.


Andrew Chen - 2/12/2014 0:28:12

1. The gulf of execution refers to the potential disconnect between the commands and mechanisms of the system and the thoughts and goals of the user. The gulf of evaluation is the potential difference between what the output displays and the conceptual model of the system that is readily perceived by the user. Bridging these gulfs requires bridging the gulfs’ respective semantic and articulatory distances. One way to decrease semantic distance is to let the task be described in the same language as the task domain, such as in a higher-level language. One way to decrease articulatory distance is to provide an interface that permits action specification through mimicking it.

2. One way in which heuristic evaluation is more beneficial than usability testing is that, since evaluators are not actually using the system, heuristic evaluation is suitable for use early in the usability engineering lifecycle. In addition, heuristic evaluation is a very cost-efficient method of usability engineering.

The mobile application I chose that I think violates at least 2 heuristics is Google Sky Map.

a. Help and documentation: It doesn’t have a clear and concise guide on how to read the constellation map. The help view is too verbose.

b. Visibility of system status: The user doesn’t know if he is orienting the phone correctly, or whether he is getting an accurate reading of the stars.


Sijia Li - 2/12/2014 0:43:00

1. In short, "the gulf of execution" can be thought of as the "executability" of the system towards a particular task; it is the gap between the user's task domain and the system's interface settings. "The gulf of execution" is related to the "input interface language" (p. 321). On the other side, "the gulf of evaluation" corresponds to "the output interface language" (p. 321); it can be thought of as the gap between the terms of the output and the user's intention. Moreover, "the gulf of evaluation" implies the amount of effort that the user needs to expend to determine whether the goal has been achieved. It is important to note that each gulf is "unidirectional" (p. 319). The gulf of execution goes from goals to system state; the gulf of evaluation goes from system state to goals.

"The gulf of execution is bridged by making the commands and mechanisms of the system match the thoughts and goals of the user" (p. 318). "The gulf of evaluation is bridged by making the output displays present a good conceptual model of the system that is readily perceived, interpreted, and evaluated" (p. 318).


2. Heuristic evaluation can be more beneficial than traditional usability testing mainly in the following aspects:

(a) Heuristic evaluation ensures the "independence" of each evaluation. "Heuristic evaluation is performed by having each individual evaluator inspect the interface alone" ("How to conduct a heuristic evaluation", Nielsen).

(b) Heuristic evaluation grants the willingness of the observer to answer questions from the evaluators during the session. However, in traditional user testing, one normally wants to discover the mistakes users make when using the interface; the experimenters are therefore "reluctant to provide more help than absolutely necessary".

(c) Heuristic evaluation is able to afford the extent to which the evaluators can be provided with hints on using the interface. On the contrary, in traditional user testing, users are requested to discover the answers to their questions by using the system rather than by having them answered by the experimenter.


I used a laundry card payment transaction mobile app, which violates at least 2 of the heuristics:

First, it does not offer "Error Prevention" or "Help users recognize, diagnose, and recover from errors". At least 3 times, when I tried to charge my laundry card with my credit card via that app, I did not get the money on my laundry card and my credit card was still charged! And, the worst part is that there is NO error message telling me what might have gone wrong! Not to mention "error prevention". And, the app does not have any instructions on how to contact the company if there are any technical issues.

Second, the app violates the "User control and freedom" heuristic. Its buttons are pretty close to each other. It is pretty easy to press the wrong button. And, once you have pressed the wrong button, it is hard to go back. In some situations, you have to manually exit the app in a "brute force" way by killing the app from the list of tasks and restarting it.


Sergio Macias - 2/12/2014 1:02:37

1) The gulf of execution is the gap between the command of the user to perform an action and the actual mechanism which performs said action. One way to bridge this gulf is to make more explicit or more obvious (perhaps through direct interface) how to correctly perform specific actions; i.e. if you want to add 1 to the score of a team in a Score-Keeper app, press the “+” button under that team’s name. The gulf of evaluation is the mental gap one must cross to analyze the output given and see if it’s the output that was wanted/expected; i.e. if, after pushing the “+” button meaning to raise the score of team 1, team 2’s score goes up by 1, the user will know that they performed the wrong action and can undo it.
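The Score-Keeper example above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class `ScoreKeeper` and its methods `press_plus`, `undo`, and `display` are hypothetical names, not part of any real app. The "+" action sits on the execution side, while the returned display string is what the user reads on the evaluation side to notice a slip and undo it.

```python
class ScoreKeeper:
    """Hypothetical score keeper: one "+" action per team, plus undo."""

    def __init__(self, team_names):
        self.scores = {name: 0 for name in team_names}
        self._history = []  # teams, in the order their "+" was pressed

    def press_plus(self, team):
        """The "+" button under a team's name: add 1 to that team."""
        self.scores[team] += 1
        self._history.append(team)
        return self.display()

    def undo(self):
        """Reverse the most recent "+" press, if there is one."""
        if self._history:
            team = self._history.pop()
            self.scores[team] -= 1
        return self.display()

    def display(self):
        """The output the user reads to evaluate whether the goal was met."""
        return ", ".join(f"{name}: {score}" for name, score in self.scores.items())


keeper = ScoreKeeper(["Team 1", "Team 2"])
print(keeper.press_plus("Team 2"))  # the user meant Team 1; the display exposes the slip
print(keeper.undo())                # the wrong press is reversed
print(keeper.press_plus("Team 1"))  # the intended action
```

Returning the display after every action is one possible design choice here: the user never has to ask for the current state in order to check the result.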

2) Heuristic evaluation could possibly be more beneficial than usability testing, depending on the UI being developed and what stage of testing the UI is in. In the early stages of testing, I feel heuristic evaluation would be good because it would be quick and would not take long to set up, whereas it would take much longer to gather a large group of relevant test users. However, later in the design stage, I feel that usability testing would be a necessity to see where the average user is going wrong and perhaps what assumption about the user base is wrong.

The mobile application I chose was CamScanner, which takes a photo of pictures and converts them to pdfs. One of the heuristics it violates is that when you first launch the app, it automatically launches to a picture with a message and you have to go across a few of these before reaching the “menu” button to get to the actual app menu. I was initially confused because I didn’t know you had to swipe to the side to reach the menu screen, since there was no visibility of system status and what was currently going on. Another heuristic it violates is that it doesn’t match the system with the real world, in that the icons representing the respective actions do not seem to match up. For example, the way to share your pdfs is not through a person icon or a tree diagram (one node going to two others) but instead a phone icon. I’m assuming they are using that icon to represent sending the pdf from one phone to the next, but I had no idea what that button did until I pressed it, and even then it was hard to escape that menu, which in turn violates another heuristic, user control and freedom.


Myra Haqqi - 2/12/2014 2:03:13

1) An interface to a system causes distances which ultimately correspond to gulfs between a user’s goals and knowledge and the extent to which the system provides information to the user. Gulfs of execution and evaluation are unidirectional.

Gulf of execution describes the relation between the user’s intents and the organization of instructions given to the system. The direction of the gulf of execution involves going from goals to the physical system state. Gulf of execution is associated with semantic directness, which demonstrates the need to sync to the level of the user and how he thinks about some task. In order to span the gulf, the user must produce an information-processing structure. In relation to the gulf of execution, the semantic distance depicts the extent to which the required structure is provided by the system, as well as how much of the structure is provided by the user of the system. When a user must provide more structure, then there is consequently a larger distance that must be bridged.

Gulf of evaluation, on the contrary, goes from physical system state to goals. In terms of gulf of evaluation, semantic distance refers to how much processing structure the user needs in order to successfully discern when his goal has been accomplished. The gulf of evaluation describes the relation between intentions of the user and the output language of the system.

Articulatory distance exists in the gulfs as well, depending on the input and output. It involves providing an interface that permits specification of action by emulating it.

There are a myriad different ways to bridge these gulfs. In order to bridge the gulf of execution, one must make commands and mechanisms of the system in order to match the thinking process as well as the tasks that the user hopes to achieve. One way to bridge the gulf between the intentions of the user and the specifications needed by a computer is to provide the user with higher-level languages as opposed to low-level languages. This requires that the machine do more work in order to translate, which is beneficial in that it provides more ease to the user. One consequence of this, however, is that the language loses generality. Furthermore, when the interface approaches the end of the gulf corresponding to the user’s intention, then functions become complex and specialized.

Another solution is to write higher-level functions in terms of lower-level functions, which allows the user to more easily interact with the system when the functions match his intents. However, this restricts possibilities, expands vocabulary, and requires that the user have a solid foundation in the language interaction as opposed to the domain of the tasks involved.

One can make the output show semantic concepts directly as well, while not including general computing or output. Rather, one can develop systems for specialized functions in order to serve specific domains and areas of interest. However, this also leads to a loss of generality.

In general, the gulf between a user’s intentions and the interface must ultimately be bridged by the user, due to the fact that automated behavior by repetition does not decrease the semantic distance.

In order to span the gulf, the user can evolve his thinking in order to become accustomed to the representation of the system. The gulf is therefore bridged by moving the user closer to the system. The user might change his conceptual model to match the interface language, but the cost is that the user has to learn to think in a novel way. However, there is an advantage in power in thinking about a domain in a new way.

In summary, forming intention spans the semantic distance in the gulf of execution, and forming action spans the articulatory distance in the gulf of execution.

In order to bridge the gulf of evaluation, one can improve the output display such that it is easy for the user to understand the conceptual model of the system. The goal is ultimately to minimize the effort of the user, so that the user does not have to think excessively hard. If an interface helps bridge gulfs, then it requires less cognitive work and leads to a greater sense of directness.


2) Heuristic evaluation can be more beneficial than usability testing because the observer, or experimenter, need not analyze the evaluators of the heuristic evaluation as they have to interpret the actions of the user in usability testing. Also, evaluators analyze usability problems in terms of recognized usability principles, or heuristics, which describe characteristics of well-designed interfaces. This is beneficial in that it gives evaluators a focus of seeking information in order to improve an interface, as opposed to users who merely interact with the interface. There is also a very good cost-benefit for the heuristic evaluation process, such that after accounting for the cost of the heuristic evaluation, the ultimate benefits far outweigh any costs incurred, fixed or variable.

A mobile application that violates at least 2-3 heuristics is the United Airlines mobile application.

One heuristic that the application violates is the match between the system and the real world. The United Airlines application includes icons that I do not understand the meaning of. This heuristic emphasizes the necessity to include the user’s “language” to describe concepts familiar to the user. It should follow conventions utilized in actuality, and display information in a natural and logical order. This application violates this heuristic because the icons do not match the real world. For example, the icon of a chair provides no matching to any concepts I associate with United Airlines. However, apparently by clicking the icon of the chair, I am able to access the United Club card. There is also an icon of a banner, which resembles an award to me. Upon clicking this icon, I discovered that it refers to my MileagePlus card. There is not a clear mapping between the system and the user’s conceptual models about United Airlines, making it difficult to discern what the icons in the application mean.

Another heuristic that the application violates is recognition rather than recall. This guiding principle is meant to minimize the effort of the user in memorizing information and lighten the user’s memory load. However, in order to check in to my flight on the United mobile application, I must recall my Confirmation number, Ticket number, or MileagePlus number, none of which I have memorized. This violation causes the user to need to memorize or take time to find the information required, without providing us easy access to that information.

Furthermore, another heuristic that the application violates is help and documentation, because I am not able to easily find any sort of help functionality. The purpose of this heuristic is to allow the user to easily obtain necessary information when needed. If a user has a question or misinterprets part of the United Airlines application, however, he has no way to seek information to help him understand the aspects of the system.


Seth Anderson - 2/12/2014 2:39:43

1) A gulf of execution is the gap of user input with the actual action taken in the machine: for example, the delay between when a user flips a switch and a lightbulb actually turns on. The delay in execution is how long it takes for the user to perceive the action has occurred: in that example, the delay in the user noticing the light is on. One way to bridge the gap in execution is to optimize computing processing speeds by adding more cores or processing power. A way to improve the perception gulf would be to add more signifiers identifying just what is going on behind the screen.

2) Evaluating heuristics is a good way to identify if a product will succeed because you are testing each specific aspect of a product, leading to a more detailed overview, rather than usability testing which is far more broad and does not dig deeper into individual parts.

One app that violates 2 major heuristics is the Facebook app on iOS. One violation can be found against minimalist design. Though aesthetically pleasing, the app is riddled with pop up menus and icons that make it difficult to navigate, and the Messenger icon takes the user to a whole separate app, creating a major inconvenience and complication. Another heuristic it violates is flexibility for expert users: there are no options to tailor the app optimally to expert users, all users are treated exactly the same.


Andrea Campos - 2/12/2014 2:41:34

A gulf of execution is when it is difficult for users to accomplish what they want to do with a system, and thus when they have to exercise more cognitive effort to figure out how to accomplish these goals. This can be bridged by making the controls, input descriptions and overall usage of the system conform more to the way users think, and the sorts of languages they are used to, so they can more easily do what they need. A gulf of evaluation is when it is difficult for users to understand or interpret the outputs and information of a system, and thus difficult evaluating whether their goals have been achieved. This can be bridged by matching outputs to the user's cognitive models and ways of thinking, making outputs more direct and visual, and having objects' forms suggest their meaning.

Heuristic evaluation can be more beneficial than usability testing because it results in extremely specific feedback -- "evaluators" are judging an interface against a set of tried and true principles, and so are being given a vocabulary with which to assess the interface. In particular, you are getting the information you are especially interested in about a system, rather than extraneous or irrelevant commentary. Plus, once you have a list of specific heuristics that have been violated, it is much easier to see what needs to be redesigned.

I have a flight tracker app that lets you input the flights you'll take on a trip so you can see real time information about flight takeoff time, gate changes, delays, etc. It's actually really useful, but it violates a few heuristics: 1. User control and freedom: When my destination flight got cancelled but I kept the same return flight, I wanted to be able to edit just the destination flight. However, the app didn't give you that option so you had to make an entire new "trip" and input information for both the destination and return flights all over again. 2. Recognition rather than recall: You need pretty specific information on hand or in memory to track a flight, from the airline 2-character code to the airport code and the flight number. It seems it'd be easier to use if you could use destination cities to bring up choices of airports, flights, etc. from which to choose.


Michelle Nguyen - 2/12/2014 3:02:15

1) The gulf of execution is what the user has to do in order to perform their task, given what is already provided by the system. This gulf deals with the input of the interface. Meanwhile, the gulf of evaluation corresponds to the output of the interface. It is what the user must do in order to determine whether or not their task has been fulfilled given the system's output. The first way to bridge the gulf is to use a higher-level language. For instance, the system designer can target the tasks the users typically need and make it so the system does more of the work in executing the task. This way, the user doesn't have to do some complicated lower-level operations. A good example of this is in computer science. Declaring a variable in Python is simple--the user doesn't have to state the type of the variable and the system figures it out for the user. Meanwhile, C is a lower-level language, where the user must bridge the gap by declaring the variable types on their own. A way to bridge the gulf of evaluation is to provide an output display that directly corresponds to the user's task. On the other hand, users can also make an effort to bridge the gulf. To bridge the gulf of execution, the users can change how they think of their task to think of it in the same way as the system, matching how it represents those tasks.
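As a small illustration of the Python half of the comparison above (the variable names `count`, `ratio`, and `label` are made up for this sketch), no type is written in any assignment, yet each value still has a type that the interpreter worked out on its own:

```python
# No type is written in any of these assignments; the interpreter infers it.
count = 3
ratio = 0.5
label = "total"

for name, value in [("count", count), ("ratio", ratio), ("label", label)]:
    print(name, "->", type(value).__name__)

# In C, each of these would need an explicit type, e.g. `int count = 3;`.
```

The work of deciding the types has moved from the user to the system, which is the sense in which the higher-level language narrows the gulf of execution.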

2) A benefit of heuristic evaluation is that the evaluator can directly state any comments and confusions they had while using the interface. Meanwhile, in user testing, the experimenter is the one who must infer these thoughts from just observing the user. Clearly, the experimenter has a chance of interpreting the user's actions incorrectly, which may lead the design to the wrong direction. With heuristic evaluation, there are no doubts about the evaluator's exact thoughts. Also, the experimenter/observer is able to answer any questions the evaluator may have. This saves time, allowing the observer to see more of how the evaluator uses the interface, rather than being stuck on a single detail. Answering questions will also allow the evaluator to assess the interface better, since they actually have a clear image of how they are supposed to use it.

One mobile application that violates two heuristics is one I just downloaded called Themer for Android, which allows you to apply new, creative themes to your phone's launcher. The heuristics it violated are as follows: Recognition rather than recall: The first thing a user must do when they install the application is set the app defaults/preferences for a long list of categories (phone, SMS, contacts, camera, internet, etc.), so that the icons on the homescreen point to the applications that the user actually uses. Upon clicking the default you want to set, it just gives the list of all the user's apps. Especially since I had to set so many defaults, I found myself forgetting which app default I was setting and had to exit from the list to check and refresh my memory. Afterwards, I had to click back again to return to the list. The app needs an indicator of what default you're trying to set (for example, "Choose the app you prefer for Messaging").

Consistency and standards: Although each theme in Themer may be very different, they still allow you to place your apps on the homescreen just like the original Android homescreen. With the original Android launcher, and many other custom ones I have downloaded, it has been common that I can uninstall my applications by doing a long hold on the app I want to delete on my homescreen--which is much easier than going to the application manager from the settings. I was very confused when I could not uninstall applications the same way from this launcher, and thought it should have followed the platform standards.


Jimmy Bao - 2/12/2014 3:54:57

1) The gulf of execution is basically the gap between a user's intention and a machine's execution instructions. One can also think of it as the gulf between the user's intention and all the low-level bit shuffling operations. This gap must be filled with the "user's extensive planning and translation activities" regardless of whether or not it is a complex or a simple machine. On the other hand, the gulf of evaluation is the amount of processing that is necessary for a user to determine whether or not the goal has been reached.

As mentioned in the reading, providing the user with a higher-level language that directly deals with problem decomposition and what needs to be done is one way to bridge the gulfs. Other ways include making the users very familiar with the system and moving the users closer to the system so that the gap between them and the machine's execution instructions isn't so wide. For example, CS students start off taking CS61A, where we first learn about all the higher-level stuff and remain oblivious of all the low-level stuff that goes on when we're executing all these high-level tasks. Eventually, we take 61C and we begin to understand what happens under the hood.

2) Heuristic evaluation can be more beneficial than usability testing because the observers generally have to understand and evaluate all of the evaluators' comments about what's wrong with the UI. And since different people with different experiences find different issues, it definitely helps the observers gather even more data to understand what should be fixed so that user experience would be smoother. Moreover, I think it's more beneficial that several different thought processes are taken into account in heuristic evaluation as opposed to just a couple of the observer's thought processes which could potentially lead to just one-sided thoughts.

The Ubersync for Facebook Android application (it pulls your contacts' Facebook profile pictures and sets them as their contact photos) violates the following heuristics:

- Visibility of system status: it doesn't really tell the user how far along the syncing is once a sync has been initiated. It also doesn't tell the user whether the sync succeeded or failed.

- User control and freedom: there are only two real options: run a sync to update the pictures (pulled from their Facebook account) for those you've previously included when running the sync, or run a full sync where all images are removed and the sync is started from the very beginning. The user doesn't have the freedom to sync an individual contact. For example, if I added a new contact to my contacts list, I can't just individually sync their picture. I have to run the full sync to include this person, which is a real hassle, especially for those who have a lot of contacts, because that will take a long time.

- Help users recognize, diagnose, and recover from errors: There pretty much aren't any error or help messages. The user can only assume that if the pictures are there, the sync was successful. Otherwise, the sync wasn't successful. There aren't error messages (a connection error or something of the sort) that tell the user why the sync failed.

- Help and documentation: Again, they lack all of this stuff because I guess it was meant to be a simple app. It is relatively simple ONLY when it's working correctly.
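To make the missing feedback concrete, here is a rough sketch of what a sync that reports its status might look like. It is not how Ubersync works: `sync_photos`, `fake_fetch`, and the contact names are all hypothetical, and the network call is replaced by a stand-in. The sketch reports progress for each contact, gives a reason when a contact fails, and accepts any list of contacts, so a single new contact could be synced on its own.

```python
def sync_photos(contacts, fetch_photo, report):
    """Sync each contact, reporting progress and a reason for every failure."""
    failures = {}
    total = len(contacts)
    for index, contact in enumerate(contacts, start=1):
        try:
            fetch_photo(contact)
        except ConnectionError as error:
            failures[contact] = str(error)
            report(f"{index}/{total}: {contact} failed ({error})")
        else:
            report(f"{index}/{total}: {contact} synced")
    report(f"Done: {total - len(failures)} synced, {len(failures)} failed")
    return failures


def fake_fetch(contact):
    # Stand-in for a network call; fails for one contact to show the error path.
    if contact == "Bo":
        raise ConnectionError("connection error")


sync_photos(["Al", "Bo"], fake_fetch, print)  # full sync with progress lines
sync_photos(["Cy"], fake_fetch, print)        # syncing one new contact only
```

Passing `report` in as a parameter is just one way to keep the status messages separate from the sync logic; an app would presumably route them to a progress bar or notification instead of `print`.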


Shaina Krevat - 2/12/2014 7:48:17

1. Gulfs of execution are the differences between what the user thinks they can do in a program and what the program can actually do. Gulfs of evaluation are the differences between how the user interprets what the interface can do and what the interface can actually do. Good signifiers, feedback and mapping will bridge these gaps, in that signifiers will show what the product is capable of (execution), mapping will properly show how the interface of the product corresponds to what it does (evaluation), and feedback will alert the user to how their actions affect the system (evaluation).

2. Heuristic evaluation can be more beneficial than usability testing, because heuristic evaluation tests the way that users interpret and understand the design of the product, leading to them being able to figure out how to use the product, while usability testing only tests to see if a product works how it is intended. Apple Maps is an application that violates: Visibility of System Status (often the application will just freeze when there is an error. It is impossible to tell if the system is loading and will eventually come up with a solution, or is just stuck and unable to get GPS data) and Helps Users Recognize, Diagnose, and Recover From Errors (often it is difficult to see if there is an error, as noted above, and if that is the case, there is rarely any advice on how to fix the error. The user is completely in the dark about what went wrong).


Jeffrey DeFond - 2/12/2014 8:34:18

1). The gulf of execution is the gap between a user's intended task and what is actually going on behind the scenes of a device. A way to bridge this gulf would be to provide a more structured process for users in order to save them the work of translating. The gulf of evaluation is the amount of work a user must do to make sense of a device's output. A good way to bridge this gap would be to put the output in the terms that a user would desire, lessening their work.

2). A heuristic evaluation can be more beneficial than usability testing because in heuristic evaluation each problem has a reference to which heuristic it violated. This can allow for a better understanding (in aggregate) of what is wrong with your system, as well as hints on how to fix it. I have a mobile app, DocScanner, which I use quite a bit for creating PDFs with my phone's camera. The app is useful; however, it violates several of the heuristics. First of all, it violates the heuristic of Consistency and Standards: when I scan a sheet of paper, the system will assign it a category (doc, letter, receipt or misc). I have been using this app for almost two years and I still have no clue how it does this; to me it appears that it randomly picks one and assigns the PDF to that folder. Along this same vein, the app also violates the visibility of system status heuristic, particularly when it comes to uploading the PDFs to Dropbox. I usually have to manually check my Dropbox folder to see if the upload has completed.


Tien Chang - 2/12/2014 9:29:32

1) The gulf of execution is the distance between how much a user must do on an interface for desired purposes and how much an interface does to produce a result. This includes forming an intention in an activity and forming an action specification. The gulf of evaluation is the distance between what a user expects to see after performing an action and what the interface actually outputs or displays. This includes interpretation and evaluation. Ways to bridge these gulfs are to require more effort from the system designer on the system side and/or to require more effort from the user on the user side. The system designer could construct higher-level languages that provide consistency across the interface, such as Lisp and UNIX. Similarly, the user can do more research to understand the system, adapt from experience, and develop competence.

2) Heuristic evaluation can be more beneficial than usability testing when evaluators have questions about an application that requires expertise. This allows evaluators to better test the application's usability. A mobile application 1) that is in another language, 2) in which actions cannot be undone, and 3) that does not contain any help documentation is an application that horribly violates some of the most common heuristics. 1) An application that is in a language that you cannot understand would have a rating of 4 on the severity scale. If the system does not match the user's real world, it would be difficult for the user to utilize this application. 2) An application in which actions cannot be undone would have a severity of 2 or 3 because the problem affects all actions frequently, the impact may be great, and it persists throughout the use of the application. This disallows user control and freedom. 3) An application that does not contain any help documentation would have a severity level of 2 to 4. This problem could potentially be passable, assuming the application is easy for the user to understand without requiring help. However, if the application requires complex use, then this violation would have a degrading market impact.


Andrew Lee - 2/12/2014 9:43:04

1. The gulfs of execution and evaluation are the hurdles and processing that the user needs to overcome to use a system. The gulf of execution covers the direction of taking the user's intention, and how to express it to the interface, while the gulf of evaluation covers the direction of taking the system's output, and figuring out its meaning and how well it satisfies the user's intention.

Some ways to bridge these gulfs are: - Providing tools to the user at the level of granularity that matches the domain of tasks that can be performed. - Making the input and output languages inter-referential, so that both directions of the interaction can be expressed in the same way. - Making the representation of entities resemble real-world counterparts as much as possible.

2. Heuristic evaluations can be more beneficial than usability testing because it's usually cheaper, more focused/structured, and can still be pretty effective in making the product more usable. Thanks to the lower cost in time and energy, it can be better incorporated in a rapid iteration cycle.

A mobile application that violates several heuristics is Keygenjukebox, which is a media player that plays keygen chiptune songs. Some of the heuristics that it violates are: - Visibility of system status: Most media players on Android have a persistent notification when they are playing, but this one has no notification at all when a song is playing. - User control and freedom: Most media players let the user seek to any part of the currently playing song, but this one doesn't. - Consistency and standards: I think for most media players, when the song is playing, the play/pause button displays the pause icon and vice versa, signifying the action that the player will take when that button is pressed. On this one, it's reversed, effectively signifying the current state instead of what it will be. - Flexibility and efficiency of use: On this media player, playlists aren't as easy to use as those on other media players. Additionally, the user has no choice in the available song collection (i.e., no way to supply their own music), granted this app was not necessarily designed to.
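The play/pause convention described above can be sketched in a few lines. This is only an illustration, not the code of any real media player; the `Player` class and its method names are hypothetical. The icon is derived from the current state so that it always signifies the action the next press will perform.

```python
from dataclasses import dataclass


@dataclass
class Player:
    """Hypothetical media player state, for illustration only."""
    playing: bool = False

    def button_icon(self) -> str:
        # Conventional mapping: the icon shows the action that a press
        # will perform, not the state the player is currently in.
        return "pause" if self.playing else "play"

    def press(self) -> None:
        # Toggle between playing and paused.
        self.playing = not self.playing


player = Player()
print(player.button_icon())  # stopped player offers "play"
player.press()
print(player.button_icon())  # playing player offers "pause"
```

A player that returned the opposite icon from `button_icon` would be the reversed behavior the response describes.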


Allison Leong - 2/12/2014 10:12:20

1. The gulf of execution is the distance between the high level description of a user’s intention and the low level tasks required to accomplish the intention. If an interface language is at the level of primitive operations, then the gulf of execution is wide and this gulf must be filled with extensive planning by the user in order for the user to figure out how to use the machine to accomplish the task. The gulf of evaluation refers to the amount of “processing structure that is required for the user to determine whether a goal has been achieved”. One way to bridge the gulfs is to create higher level interface languages that are specific to the tasks of the user. For gulfs of execution, the user would only have to specify the goal and the system would take care of planning the individual tasks required to reach the goal. For gulfs of evaluation, the system would perform the required analysis to output the exact information to show the user that the goal has been attained. Higher level languages lead to a loss of generality in the use of the machine, so an alternative way to bridge the gulfs is to allow the user to add their own specialized operations using the generic operations available. A third way is just to train users to do all of the processing and planning when using the machine. A fourth way is to adapt the user to think in terms of the system, thereby moving the user closer to the system.


2. Heuristic evaluation can be more beneficial than usability testing in situations where the designer lacks the time and money to do usability testing. Usability testing requires time and money to coordinate the test subjects, test them, and analyze the test data. While the results of heuristic evaluation may not be as accurate as the results of usability testing, the benefits of cost-effectiveness and speed of heuristic evaluation may outweigh the cost of less accurate data. The mobile app VTok (gchat on the iPhone) violates the heuristics of: - Visibility of system status: in moving between screens, the app often displays a blank screen with the small “loading” symbol, but no additional information to indicate the reason for the holdup. At times, I am stuck on this screen for so long that I give up on using the app. - Aesthetic and minimalist design: the app displays very distracting ads along the bottom of the screen that block some of the data on the page. - Help users recognize, diagnose, and recover from errors: at times, the contacts page displays no online users, even when there should be online users, and there is no way to refresh the page.


Shana Hu - 2/12/2014 11:21:32

The gulf of execution refers to the semantic distance between what the user wants to do and what the interface actually does on a low level. The extent of this gulf relies on complexity of the interaction and the implementation of the machine. Typically users must convert their high-level idea into an information-processing structure to accomplish their goal.

The gulf of evaluation is the semantic distance that determines how difficult it is for the user to decide if their goal has been achieved or not. A large gulf of evaluation will require the user to translate the systematic output into something that corresponds to their original intent, which is a burdensome and unideal task.

To bridge gulfs, designers can design their systems to more closely mimic common user goals, or the designer can require the user to do more of the evaluation. The latter is less effective because it places a larger burden on the user. However, as users use the same system repeatedly, gulfs will be bridged due to familiarity with the interface.

Heuristic evaluation can bring to light specific intricacies of the system that first-time test users may not come across. Yelp's mobile app violates the flexibility and efficiency of use heuristic by opening up to a home screen which shows a bunch of links rather than any directly useful information. The app is also cluttered, which violates the heuristic of aesthetic and minimalist design and causes confusion.


Andrew Dorsett - 2/12/2014 12:00:01

The gulf of execution is the distance between what a user thinks they should do to achieve a goal and what they actually have to do. The gulf of evaluation is the distance between how well a system represents information and functionality compared to the expectations of a user. There are two main ways to bridge the gulfs. The first is for the designer to construct higher-level and specialized languages that make the semantics of input and output languages match that of the user. The second is for the user to learn to think in the same language and build automatic responses to the interface.

One of the main benefits of heuristics is that they are effective and cheap. It's debatable whether they're inferior to traditional usability testing, but the cost/benefit ratio can be very good. I use a mobile application called "AutoResizeWallpaper". It's used to resize background images to fit your phone's screen. It violates: 1) Visibility of system status - There are several settings for the images but there is no way to see current settings.

2) Error prevention - It resizes images, but for no apparent reason one day the resize will be completely off, meaning it's extremely zoomed in or tiny. I have to reset the image and do it again.

3) Aesthetic and minimalist design - It's essentially a list of text options with 2-3 sentences underneath each. It's cluttered, has too much information, and makes it hard to instantly differentiate between options.

4) Recognition rather than recall - There are some operations that I have to set and I can never remember which ones they are. Some options are similar and because I use the application occasionally I have to struggle to remember the correct one.


Dylan McCapes - 2/12/2014 12:08:06

1) The gulf of execution is where the user finds the tools to achieve their goals. This could be through a graphical application, or through language. For instance, a language that provides a large library of classes with descriptions and method lists, such as Java, would have a shorter semantic distance than an instructional language such as Scheme.

The gulf of evaluation is where the user interprets the program's output. A shorter semantic distance in the gulf of evaluation means that it's easier for the user to understand the results. Clearly defining what the program outputs and how it pertains to the input would decrease this distance.

2)

Heuristic evaluation can be more beneficial than usability testing by not requiring users to come in for testing.

An example of a mobile application that violates 3 heuristics is a mobile online shopping app that does the following: displays information on how to use the app on entry screen, uses "Go!" for every action button and doesn't let the user return to shopping once they've continued to checkout. The first heuristic is 'recognition rather than recall'. By displaying app instructions only on an entry screen the user must remember these instructions throughout use. The second heuristic is 'consistency and standards'. If "Go!" means "add to shopping list" as well as "continue to checkout" the user could easily make a mistake. The third heuristic is 'user control and freedom'. By not allowing the user to return to shopping once they've continued to checkout the app makes it impossible for the user to undo the aforementioned "Go!" mistake.


Steven Wu - 2/12/2014 12:21:08

1) The main difference that I notice between the gulf of execution and the gulf of evaluation is what additional steps are required of the user. With the gulf of execution there is a difference between what the user's intent is and what the system they are using allows them to do, and ultimately how well the system can support the inputted actions. It largely reflects "how much of the required structure is provided by the system and how much by the user". The main takeaway from this is that the way the user interacts with the system can be simple for the user, but "behind the scenes", from the low-level perspective of the system, the system could be constantly performing operations to keep up with the high-level execution on the user's behalf. The gulf of evaluation again relates to how the user interacts with a system, but this time from the perspective of what is outputted. As the tail end of the user interaction, the gulf of evaluation concerns the way the system provides representations of the output intended by the user. Whether or not this is useful depends on the distance of the gap in this gulf. For example, a system could provide an end state of measuring water level from the reading. With a larger distance, the system could simply provide only the end state for the user, but that would require some mental calculation for the user to determine the difference in the water levels. To instead minimize the difficulty and distance put upon the user, the water level measuring system could have an indicator that returns the information that the user intended. To bridge these gulfs, it is best to provide the user with a higher-level language, something that is second nature to the user. This could mean that the interaction language the user uses could be described in the same language used in the task domain itself.
It is also important to provide consistency for the users across the interface surface. This could mean that the system only requires input in certain regions, ultimately teaching the user which parts of the system they should interact with. But achieving this bridge is quite difficult given the variety of meanings in our spoken languages and in the interaction language. To a certain extent, the coverage needed to bridge the distance could grow very large and become overbearing for the user. It is also important to keep in mind that although we as users converse in spoken language with ease, a user's interaction with a system is going to involve a translation between high level and low level languages, since there will still be a need to fulfill something at the lower level of the system's own language.


2) Heuristic testing can be applied to more than those individuals who are familiar with the domain of the system. Largely speaking, you have a larger pool of people to evaluate a given system, each potentially coming from a different background and ultimately providing different usability problems that they can find. This is certainly a positive thing since not every evaluator is going to be successful at finding every particular usability problem. Another key difference is that there is an observer who administers the evaluation process and is willing to answer the questions from the evaluators. This is particularly important to the success of the heuristic evaluation since there might be evaluators who are not familiar with the system's domain. Answering the evaluators' questions allows the evaluators themselves to better assess the usability of the user interface with respect to the characteristics of the domain. Another added benefit is the ease of using a system that isn't fully fleshed out. This could mean an evaluator could assess the system earlier in the design cycle using a paper prototype, well before the software development cycle begins or the fabrication stages begin for a tangible product. This flexibility in when the heuristic evaluation takes place also comes from the idea that the evaluators aren't using the system to complete a task, but instead simply assess the usability problems, and a working prototype is enough to get the gist of the system's concept.

One mobile application that has usability problems has to be Alien Blue, a reddit client on my iOS device. Violation 1) Recognition rather than recall: This application is notorious for requiring the users to recall what their favorite subreddit is, and there are no metaphorical objects (like an image of Oski on the desktop version of the subreddit) to help guide the user to where the desired content is on reddit. Violation 2) Aesthetic and minimalist design: There is no negative space. The application is a cluttered list of headlines. And a lot of the time, the main meat of reddit is in the commentary from the peanut gallery. The way the headlines are listed throws off the user's intuition, so they open the headlines when they wished to access the comments and vice versa. It soon becomes a guessing game what the next page will be based on the user's click. Violation 3) Visibility of system status: When loading up a linked image on Alien Blue, there is a possibility that the page won't load in the client's built-in internal browser. There is a toggle switch to change between views. Supposedly one view is more friendly for the Alien Blue application and the other end of the toggle switch is supposed to be a more traditional mobile browser view. Together the two provide all there is to the browser's settings and fail to let the user know what the current system's status is. Many times I find the page that I have opened "stuck" in loading purgatory without much feedback to the user. There is no loading bar, and it becomes problematic to the point that users feel they are obligated to restart the application.



Christopher Echanique - 2/12/2014 12:29:47

The gulf of execution is associated with matching the goals of the user with the commands and mechanisms of the system. The gulf of evaluation deals with presenting a good conceptual model of the system to the user through the system’s output. Reducing the semantic distance can help to bridge these gulfs. This can be done by providing the user with a higher-level language instead of requiring the user to decompose the task using low-level operations. Semantic distance can also be reduced by providing a “What You See Is What You Get” type of interface that displays semantic outputs directly to the user. Another way to bridge these gulfs is to reduce articulatory distance. This can be done by providing an interface that permits specification of an action by mimicking it, such as allowing a user to draw on the screen as input if the user’s intention is to draw an image.

Heuristic evaluation is more beneficial than usability testing because it allows evaluators to independently explore a set of usability heuristics to determine the effectiveness of the interface. In usability testing, the user interacts with the interface without the aid of the observer in order to find usability issues. Many bugs in the interface can be overlooked in this case. Heuristic evaluation provides a more structured approach to evaluating the system based on predetermined principles.

A system that very clearly violates some heuristics is the NextBus app, a mobile application that allows users to search for bus routes and times within their city. Below are some heuristics that are violated:

• User control and freedom
  o The app fails to provide a way for users to return to the previous screen. When the user goes through the process of selecting a stop, the only way back is to restart the whole process.
• Recognition rather than recall
  o The app requires users to remember the intersection of stops, rather than providing an interface for selecting a stop on a map. Sometimes I try to find a bus in an area I’m unfamiliar with; I don’t want to search through a list of intersections to find the stop nearest me.
• Visibility of system status
  o When a stop is selected, it isn’t quite clear which direction the bus is going. The app fails to communicate this effectively to the user.


Nahush Bhanage - 2/12/2014 12:44:00

1) The gulf of execution is the difference between the intentions of the users and what the system allows them to do or how well it supports those actions.

The gulf of evaluation is the degree to which the system provides representations that can be directly perceived and interpreted in terms of the user expectations. In other words, it stands for the psychological gap that must be crossed to interpret a user interface display.

The gulf of execution could be bridged by making the system mechanisms match the thoughts and goals of the user. The gulf of evaluation could be bridged by making the output displays present a good conceptual model of the system that is easily perceived, interpreted and evaluated.

This can be achieved in the following ways -

(a) The designer can construct a higher-level specialized interface, ensuring that the semantics of the input and output languages match that of the user. This approach demands significant effort on the part of the system designer.

AND/OR

(b) The user can build competence by learning to think in the same language as that required by the system. This approach requires significant effort on the part of the user.


2) Heuristic evaluation can be more beneficial than usability testing for the following reasons -

(a) Heuristic evaluation can provide quick and relatively inexpensive feedback to designers early in the design process. Usability testing is expensive and generally time consuming.

(b) The validity of usability test findings depends heavily on identifying the right target group and on the accuracy of the usability testing protocol in recognizing key user tasks. On the other hand, heuristic evaluation involves evaluators examining the interface and judging its compliance with recognized usability principles.

(c) Since each observed usability problem is explained with reference to an established heuristic, it is easy to generate a fix.

(d) Heuristic evaluations can be conducted early in the design life cycle to find potential usability problems. This makes them considerably easier and cheaper to fix than if they were to be discovered at later stages.

A mobile application that violates at least 3 heuristics -

I recently downloaded a music streaming application from the play store. I could identify 3 heuristics that this app violates:

(a) "Visibility of system status" - It takes about 4-5 seconds for a song to start playing after it is selected. I presume that's the buffering time, but there's no such indication (usually music players have a buffering indicator). The screen appears to be frozen.

(b) "Help users recognize, diagnose, and recover from errors" - In case there's an internet connection problem and the app isn't able to buffer a song (again, I presume that's the reason), it just gives out a 3 digit error code with no description of the problem.

(c) "Flexibility and efficiency of use" - This is probably a minor one, but it's still annoying. Whenever you launch this app, you HAVE to go through two pages of instructions on how to use it. There's no option to skip these pages. And by the way, I confirmed that these instructions do not include the error code description I mentioned above :)


Liliana (Yuki) Chavez - 2/12/2014 13:08:39

1) The gulf of execution is the distance between a user's intentions for performing a certain action, and what the user needs to do to communicate that action to the interface. The gulf of evaluation is the distance between the feedback given to the user about how the input given by the user registered in the system, and the user interpreting that feedback. Generally, closing the semantic distance in the gulfs requires that the designer/programmer abstract most of the lower level actions to make the machine do most of the translating (the authors suggest higher level languages as a way to bridge the gulf). Another way is to make changes to the interface directly visible to the user.

2) Heuristic evaluation can sometimes model a different aspect of real life situations where one would use the application: for example if you are learning a new application you might have a friend who is more familiar with the application and might be helping you figure out certain functions. In this case, heuristic evaluation might better simulate the user experience of a more experienced user who would still have usability problems even with learned automation.

As an example of a mobile app, I have DAR (Discreet Audio Recorder), which shows a blank screen while recording. Because the whole purpose of the app is discretion, there are no words or symbols to indicate whether it is recording or has simply frozen (this is a violation of the visibility of system status). It also has a menu that shows all previously recorded items, but it is always difficult to remember how to access it, so I end up swiping the screen every which way until I can finally get to the menu (this violates the heuristic of recognition rather than recall). While I understand that the whole point of the app is to be discreet, I believe there could be a way to implement discretion with more usability.


Jeffrey Butterfield - 2/12/2014 13:12:53

Q1) The gulf of execution is the disparity introduced by an interface between a user’s goals or thoughts and the actual mechanisms provided by the system. An interface with a small gulf of execution is one that requires little effort from the user to translate a goal into the respective behavior that makes the interface act in a way that achieves that goal. Conversely, an interface with a large gulf of execution would require much effort to do the same. While greater gulfs of execution do require more cognitive effort for the user, the tradeoff is that other designs with smaller gulfs may lack the level of generality an interface with a wider gulf possesses.

Similarly, the gulf of evaluation is the disparity introduced by an interface between a user’s notion of how the system should respond to input and the expressions of a system’s output interface language. There are many different ways an interface can report the effects of a user’s input; the more cognitive effort needed by the user to interpret any given expression of the output interface language, the greater the gulf of evaluation.

To span the semantic distance of a gulf of execution, the user forms an intention, essentially the meaning of an input expression that satisfies the user’s goal. This requires cognitive effort on the part of the user. Likewise, to span the articulatory distance of a gulf of execution, the user creates an action specification by preparing an expression in the input interface language that is equivalent to the semantic meaning of the intention. To span the gulf of evaluation, a user must first interpret or grasp the meaning of what he or she is receiving as output (i.e. span the gulf’s articulatory distance) and then evaluate that meaning by comparing it to the original goal.

Q2) Heuristic evaluation can be more beneficial than usability testing in several ways. With heuristic evaluation, an observer need only record the explicit/literal feedback given by the evaluator; for usability testing, the observer has to interpret the user’s actions to procure information about possible design flaws in an interface. Second, evaluators aren’t trying to perform a real task with the interface, so early prototypes can undergo heuristic evaluation while user testing is usually performed with a functional implementation. This makes heuristic evaluation beneficial because it allows for potential heuristic violations to be caught early in the design process. Finally, users in usability testing cannot receive answers to their questions or hints for encountered problems, as usability testing aims to monitor exactly what hinders a user in a real setting without aid from an observer. Heuristic evaluation, on the other hand, involves evaluators who can provide informed feedback about the usability of an interface even after receiving hints that help them when they are blocked.

The mobile application I have chosen to evaluate is a free iPhone app called Ruler. It allows you to take measurements of objects in the real world by moving your phone alongside the object being measured while keeping your finger in place on the screen. This “drags” a tape measure-like graphic until your finger runs off the screen. You must then place your finger on the opposite edge of the screen to perform another dragging motion. While this app does do a good job matching the system to the real world (by using the tape measure analogy), it violates a few heuristics.

Error Prevention: To quickly scroll to a different part of the “tape measure” graphic, you can swipe quickly with your finger and the graphic will keep moving fast despite no actual finger movement. However, when measuring, if you accidentally move your finger too quickly off the edge of the phone’s screen when completing a measuring drag action, it behaves as if you are trying to scroll through the tape and continues to move the graphic, thus messing up your measurement. To fix this, the design should be modified, perhaps by adding separate measure and scroll modes, or even by removing the scrolling functionality and adding a “reset to 0” button.

User control and freedom: At 50 inches, the tape measure graphic stops. This means you cannot measure something over 50 inches, an arbitrary restriction that limits user freedom. To change this, the design should not use a static image of a measuring tape but instead mimic the same functionality by updating a distance variable and reporting that as the user measures.
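The "distance variable" idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how the Ruler app is implemented: the `TapeMeasure` class, its method names, and the `pixels_per_inch` parameter are all hypothetical. Each drag adds its length to a running total, so the reported measurement has no built-in upper limit.

```python
class TapeMeasure:
    """Hypothetical model: accumulate drag distance instead of scrolling a fixed image."""

    def __init__(self, pixels_per_inch: float) -> None:
        # Assumed to be known for the device's screen.
        self.pixels_per_inch = pixels_per_inch
        self.distance_px = 0.0

    def drag(self, delta_px: float) -> None:
        # Each measuring drag adds its length to the running total.
        self.distance_px += delta_px

    def reset(self) -> None:
        # Corresponds to a "reset to 0" button.
        self.distance_px = 0.0

    def inches(self) -> float:
        # Report the accumulated distance in inches.
        return self.distance_px / self.pixels_per_inch


tape = TapeMeasure(pixels_per_inch=100.0)
for _ in range(3):
    tape.drag(2000.0)  # three full-screen drags of 2000 px each
print(tape.inches())   # total exceeds 50 inches with no cap
```

Because the total is just a number, the 50-inch limit of a static image does not arise; the on-screen tape graphic could be drawn from `inches()` rather than the other way around.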


Anthony Sutardja - 2/12/2014 13:47:42

The gulf of execution is the distance between completing a task and performing the steps necessary to execute the task on a machine. The gulf of evaluation, on the other hand, is the distance between what is happening and what the user understands is happening with a machine.

Heuristic evaluation can be more beneficial than usability testing for a variety of reasons. Primarily, heuristic evaluation guides the user (rather than the observer) in answering broad heuristic principles that are widely applicable. This is a shift from usability testing, where the experimenter tries to infer the difficulties a user is having. This allows a detailed evaluation of what the user is actually thinking.

Another benefit of heuristic evaluation is the willingness for observers to answer questions from the evaluators. Rather than waiting for mistakes to happen, the observers guide the user in how to use the interface. This allows more responses about the interface rather than potential domain-specific difficulties.

The iOS native Mail application violates two heuristics.

First, it doesn't meet the heuristic for consistency and standards. When you open an email in the app, you have a set of icons on the bottom bar while you read the email. If you wish to mark it as unread so that you can view it later, there is a lot of difficulty in finding where this option is. On Mac OS X, the flag icon means to "star" or "favorite" a particular email. However, in the iOS Mail app, the flag icon pulls out another menu, in which you can mark an email as unread, along with a variety of other options.

Second, it doesn't meet the heuristic for error prevention. I have several email accounts managed with the iOS Mail app. When I want to compose or reply to an email, it is not clear which account I am sending it from. There is a view that shows a portion of the sending address, but it does not have room for the suffix (the @{gmail.com,berkeley.edu,etc}) portion of the address. I often find myself sending emails to recipients from the wrong email address, which breaks consistency. That being said, this may be an edge case due to the lengthy nature of my full name as my email address.


Rico Ardisyah - 2/12/2014 14:36:54

The gulf of execution is the difference between the user’s intention toward a device/interface and how well the device supports responding to that intention. For example, a user wants to type a document in word processing software, and the device responds to the user’s typing and shows the result on the screen. Imagine if the screen instead showed the bits from the assembly language; this would be a huge gulf of execution. The gulf of evaluation can be described as how well a device/interface provides a representation that the user can match against their intention. The gulf is considered small when the system provides clear, understandable, unambiguous information that represents the user’s intention toward the device/interface. To bridge the gulf of execution, the device/interface has to provide commands and mechanisms that match how users think about it. Using a higher-level language is a good choice, since making users deal with a lower-level language is unwise. To bridge the gulf of evaluation, on the other hand, a device/interface should provide a display that is understandable and easy to use. One example is keeping to the concept of WYSIWYG (“What you see is what you get”).

Heuristic evaluation is more beneficial than usability testing because the experimenter in a heuristic evaluation is more willing to answer questions from the evaluators. This gives the evaluators a better assessment of the UI. In addition, the experimenter can give the evaluators more hints in a heuristic evaluation. Hence, they don’t waste time struggling to understand the UI. Indeed, when I evaluated the Piazza app for Android, I found it violates 2 heuristics. First, it does not have help and documentation. Even though the app does not have complicated features, as Jakob Nielsen said, help and documentation are a good supplement for an app. Second, Piazza does not provide proper visibility of system status. I expect Piazza to give a notification when posts we follow receive replies.


Ryan Yu - 2/12/2014 14:42:12

1) The gulf of execution describes the following: essentially, tasks can be broken down into the higher-level instructions of the task, and the lower level instructions that the machine executes to actually carry out the task. What causes issues is when the gulf between user intention and machine execution is very large -- that is, when there is a great distance between what the user sees as his/her end goal (from a higher level), and what the machine needs to be told to achieve the objective. When this gulf is large, this causes confusion and complication in the carrying out of the instructions on the user's part. Some ways to bridge this gulf include the user "generating some information-processing structure," that is, having a widespread general knowledge about the issue at hand before he/she decides to follow through with it. In this sense, "the more that the user must provide, the greater the distance to be bridged." Another way for the user to bridge the gulf includes being strongly familiar with the machine's workings, and how it will decide what to do logically to achieve the task. This may seem obvious, but a little knowledge of the lower level details on the user's part can go a long way.

The gulf of evaluation refers to the amount of processing that the user must go through to determine for him or herself whether the goal that they have set out to do has actually been achieved. In this sense, the output that the machine (or another individual) spits out may be of a slightly different topic than the user intended the end-result to be -- in this situation, the user must "translate the output into terms that are compatible with the [goal] intention in order to make the evaluation." When a user has to make these adjustments and adapt to differing output, then this makes the gulf of evaluation wider, and it makes it harder for the user to fully appreciate and analyze the results to his/her liking. There are two basic ways to bridge this gulf of evaluation: one comes from the system side and one comes from the user side. On the system side, the system can adapt to the user to look at what format the user wishes his/her result to be in, then compute (or recompute) according to these specifications. For instance, a sort of trivial example would be if the user wants his/her answer in inches and the machine calculates centimeters at first, then the machine would do the appropriate conversions to inches and then spit it out to the user. On the other hand, the user can help to bridge these gulfs as well, and can give very specific instructions/commands to the machine that limit output to the formats/specifications that they want. If these user-given specifications are specific enough, then the machine (hopefully) will spit out answers that are more along the lines of what the user intends them to be.
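The inches versus centimeters example can be sketched as a system-side bridge. This is a hedged illustration only: the function `format_length` and its `preferred_unit` parameter are hypothetical names, and it assumes the system stores lengths internally in centimeters.

```python
CM_PER_INCH = 2.54  # exact by definition of the inch


def format_length(length_cm: float, preferred_unit: str = "cm") -> str:
    """Report a length in the unit the user asked for (hypothetical helper)."""
    if preferred_unit == "in":
        # The system does the conversion so the user does not have to.
        return f"{length_cm / CM_PER_INCH:.1f} in"
    if preferred_unit == "cm":
        return f"{length_cm:.1f} cm"
    raise ValueError(f"Unsupported unit: {preferred_unit!r}")


print(format_length(25.4, "in"))
print(format_length(25.4, "cm"))
```

The user-side bridge corresponds to the caller passing `preferred_unit` explicitly, which limits the output to the format they want.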

2) Heuristic evaluation can be more beneficial than usability testing in a variety of different ways. Firstly, usability testing centers on testing your product/application with *users*, which means that the designers/programmers have to go out and find a suitable batch of users that is representative of their target user base. This may be a largely difficult task, and even if they manage to find a sizable group of people, this group of people may not be entirely representative of the people that they are targeting. Furthermore, while usability testing is definitely useful as a whole, it is extremely dependent upon the users who you are using as test subjects. In simple terms, people are erratic. People's habits fluctuate and change on a regular basis, and it is often the case that observing people using your application/product provides incorrect or hazy results due to simple things like laziness on the part of the user, or a general unwillingness to cooperate.

On the other hand, heuristic evaluation is, in a word, more stable. Although it involves a group of users evaluating your product/application, there are specific guidelines that the users must follow -- in this sense, the users are evaluating your product/application not entirely based upon their own opinions and insights, but also largely upon the guidelines that have been bestowed upon them (i.e. the heuristics.) Because these heuristics are well-defined and have been more or less shown to be good indicators of product/application success, you can be substantially more confident (in comparison with usability testing) about which aspects of your product/application are flawed and need improvement. Testing your product/application on a group of users who you think are your targeted user base is fine, but the fact remains that you need a baseline for evaluation, and heuristics give you just this.

One mobile application that violates heuristics would be LiveNation's mobile application, which enables you to look for events in a certain area and purchase tickets to events. The first heuristic that the LiveNation application violates is "User control and freedom", which states that "users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo." The LiveNation application violates this heuristic in that once you are "in line" to buy tickets to a highly sought-after event (i.e. right when the event goes on sale,) it puts you in a waiting queue. While in this waiting queue, there is no clearly marked button to exit out or return to the main menu to look for another event. Personally, what I have to do to get out of this screen is just to tap the home button on my iPhone and restart the application. Clearly, this violates the above heuristic, as it does not provide users with an "emergency exit" to return to a familiar page. The second heuristic that the LiveNation application violates is "Help users recognize, diagnose, and recover from errors", which states "error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution." LiveNation's mobile application violates this principle tremendously during the search for tickets to an event. First, when purchasing tickets to an event that has just gone on sale (and is in high-demand), LiveNation's application will sometimes redirect you to a page that states "An error has occured. Please try again." This obviously provides no indication of what the user should do to mitigate the problem, and is a clear violation of the above heuristic. This error also occurs in the credit card validation that sometimes appears *after* the user has bought their tickets. Upon hitting "verify", the page continuously appears as if it is loading for upwards of three minutes, and then errors out with the same message as above. Once again, this clearly violates the above heuristic.


Nicholas Dueber - 2/12/2014 14:45:16

The gulf of execution refers to the difference in the high level and low level descriptions of a problem. At a high level, the user understands what their goal is, and at a low level, the programmer must make a program that separates the two very distinctly such that the bit shuffling is carried out entirely by the program and the user doesn't have to worry about the low level bit manipulation. If an application has a large gap of execution, this is a good thing because the user's goal and the method of executing the goal are very far apart. The gulf of evaluation is the means by which a user can understand the state of a user interface. This is to say that the smaller the gap in evaluation, the better it is. The user can easily understand the signals that they are being given and can make an informed decision on how to proceed. In order to increase the gulf of execution, the user wants to be farther away from the process, which means that the user would like to be executing commands without knowing the underlying coding of a UI. This would suggest more options to execute different commands. To bridge the gulf of evaluation, we would want to add more feedback to the user. This may entail having different lights or different methods of giving the user information based on the state of the machine.

Heuristic evaluations may be more useful than usability testing. Just because an application functions doesn't necessarily mean that the application is good. Heuristic evaluation gives the testing group an opportunity to rate potential problems the user might have. They can explore a much larger range of problems that a user group may have. With usability testing, you will be limited to the exact feedback you get. With heuristic evaluation you will be able to explore many more potential pitfalls with your application.

The TED Talk application violates two heuristics. User Control and Freedom: The application allows you to select a talk you would like to view. If you choose audio only, it won't let you click another button to download the mp4 instead of the mp3 while it is downloading. Error Prevention: If you want to browse more talks as you listen to one, you are unable to do so. If you accidentally hit the back button to see the other talks, your talk stops. If you select the talk again, then the video or audio recording will start again from the beginning.


Alexander Chen - 2/12/2014 14:46:00

Gulfs are created when the user needs to think about how to interact with a system and how to interpret the feedback from a system.

A gulf of execution is the feeling of being separated from the interface while trying to achieve a goal. Perhaps the user needs to do some additional computation to input data. For example, if a user wanted to keep track of the length of their sleep, and only have this data, yet the system requires a "sleep time" and a "wake time" input, the user might have to make up some fake data, just to get the number of hours they slept recorded. However, perhaps this application is simply not the right one that suits the user's needs.
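One way to narrow the gulf in the sleep example is to let the interface accept whichever form of data the user already has. The following is a minimal sketch under that assumption; the function `sleep_hours` and its parameters are hypothetical and not taken from any real app.

```python
from datetime import datetime
from typing import Optional


def sleep_hours(
    duration_hours: Optional[float] = None,
    sleep_time: Optional[datetime] = None,
    wake_time: Optional[datetime] = None,
) -> float:
    """Return hours slept from either a duration or a pair of timestamps."""
    if duration_hours is not None:
        # The user only knows how long they slept, so no fake times are needed.
        return duration_hours
    if sleep_time is None or wake_time is None:
        raise ValueError("Provide a duration, or both sleep_time and wake_time.")
    if wake_time < sleep_time:
        raise ValueError("wake_time must not be earlier than sleep_time.")
    return (wake_time - sleep_time).total_seconds() / 3600


print(sleep_hours(duration_hours=7.5))
print(sleep_hours(sleep_time=datetime(2014, 2, 11, 23, 0),
                  wake_time=datetime(2014, 2, 12, 6, 30)))
```

Both calls describe the same night, so the user never has to invent a "sleep time" and "wake time" just to record a duration.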

A gulf of evaluation exists when there is some additional user interpretation required to understand the information presented by the interface and decide whether this information matches his goal. In short, it is the amount of effort the user needs to expend to understand system feedback.

Ways to bridge the gulfs of execution and evaluation are plentiful, but each has its strengths and weaknesses. We need to make a decision on how specialized the interface should be. If we wanted to bridge the gulf of file management on a computer, we could provide a file explorer instead of the traditional terminal interface. It may be more intuitive for users to drag files into other folders and the recycle bin, yet some other tasks that could easily be accomplished in the terminal (e.g. cat), would require the user to launch another application in the file explorer interface. This is an example of an interface as model world.

Other tasks, such as programming, can be simplified with high-level languages, such as Python. We can use commands that resemble natural language, so the gulf of execution is narrowed. However, the trade-off we make is the inability to manage memory and the data of lower level structures.

Sometimes, it may not be necessary to do much more work to bridge these gulfs. The most important thing is that the interface allows the users to perform all the tasks to reach their goals. Over time, users become familiar with interfaces they interact with, so their main concern would be whether the interface allows them to get their work done, rather than how to use the interface.

Whereas usability testing focuses on identifying users' goals and how they attempt to achieve these goals while learning the interface, heuristic evaluation employs experts who rank the interface against a set of guidelines. Therefore, heuristic evaluation is more standardized and generalizable to the general user base than usability testing.

I remember playing with the Ugly Meter.

This violated the visibility of system status when it was computing your facial structures: once I pressed compute, it would not give any feedback about whether it was processing or it had crashed. It was uncomfortable to know that I could be sitting there waiting for an unresponsive application.

It violated user control and freedom: Once you selected your gender, you could not go back to the main menu without quitting the application. There was no back function.

It violated Help and documentation: I had to click around in the application and even quit the app when I reached an undesirable state. I played around with the application for over 3 cycles before I understood what I had to do. They didn't consider human factors: many prompts looked like aesthetic add-ons, so I didn't catch them at first. There was no explanation of what the application was trying to accomplish at each stage.


Steven Pham - 2/12/2014 14:46:57

Gulf of evaluation is: the semantic difference between user intention and output language. Having direct feedback after a user does something can help bridge the gap. For example, a beep after a button press or a highlight on a selection of text.

Gulf of execution is: how much information is provided by the interface vs how much work the user has to put in to understand some information about the interface. To bridge that gulf, we can label buttons or interactive elements to tell a user what does what. For example, "submit" on a button or "on" and "off" on a light switch.

Heuristic evaluation is a bit more concrete and objective than usability testing, which is dependent on the people testing it. The Piazza app is crap. Here are the heuristics it breaks: Flexibility and efficiency of use. There are so many unnecessary steps to switch to a different class. You have to click the current class at the top, then a new view appears, and then you pick the class you want. A drop down or swipe left and right could remedy this. Consistency and standards. The mobile app experience is nothing like the web desktop experience. A lot of things assumed from the desktop experience are not the same on mobile.


Emily Reinhold - 2/12/2014 14:49:25

1) Gulfs of execution correspond to the divergence between a user's intentions and how a user must perform their intention given the limitations of the system. The bigger the gulf of execution, the more work required on the user's part to complete his/her desired action. Gulfs of evaluation correspond to the divergence between what a user sees as the output of the system and what this means for their purposes (ie. whether or not they achieved their goal). A good interface design minimizes the size of gulfs of execution and evaluation. Every interface is going to possess gulfs of execution and evaluation because it is just that - an interface. By definition, an interface is the point of interaction between a subject and a system. Since the two entities on either side of this interface are unique, there is naturally some separation between user and the system. This separation is represented by gulfs of execution and evaluation.

In order to bridge these gulfs, effort is required by the system designer, the user, or both. The system designer can explicitly reduce the size of the gulf of execution in a couple of ways:

  • By appealing to what the user already knows. We have seen this in lecture, when we evaluated the iCal application. The interface visually looks like a physical calendar, which is a concept that users have presumably learned and understand. By modeling the application interface after the physical object, the designer relies on the assumption that the user has prior knowledge of how to use a calendar, making the user's execution of actions and evaluation of results easier.
  • By showing the user how to perform key actions. For example, this can be done through a short tutorial that the user must complete before having unrestricted use of the application. In one of my previous internships, the company developed a Wi-Fi analyzing application in which users could conduct surveys of their building to evaluate the strength of Wi-Fi in certain areas. When users first downloaded their application, they showed a short tutorial (which could be accessed again later) about how to conduct a survey and how to interpret the results. I found this very helpful to easily communicate how the user is intended to use the application. While a good UI design should communicate this anyway, it is valuable to have a fallback plan if users don't interpret icons/UI elements how the system designers originally expected.
  • By showing extra information when it is helpful. This helps bridge the gulf of evaluation. When a system requires that the user put in a lot of added effort to interpret the results, users don't tend to like their experiences with the system. For example, when I want to wash the dishes with warm water, I have to put in a lot of effort to get the water to the temperature I want - turn on some hot and some cold; wait for the temperature to become stable; feel the water and evaluate whether it is too hot or too cold; make adjustments and repeat. If instead the manufacturer of the faucet included a temperature indicator, I would not need to put in any effort to achieve the right temperature of water after I learned (probably in the first use) what temperature I like to wash dishes with.

2) Heuristic evaluation can be more beneficial than usability testing when the system designers seek information about their UI design before it has been implemented. Since the experimenters in usability studies generally do not want to help the users/answer questions so they can get a sense for difficulties a real user might have with the product, it relies on the interface being more or less complete. That is, a paper mockup on notecards is not going to be enough to conduct a usability study. This is where heuristic evaluation can be valuable to help drive the design process while the implementation is incomplete.

Further, heuristic evaluation is useful for consistency evaluation. Jakob Nielsen mentions "Consistency and Standards" as one of the ten useful heuristics. The consistency of an application in the context of a predefined platform is important for usability, but is more easily evaluated if evaluators are specifically looking for consistency than if a user is just pointing out difficulties using the product for its intended purposes. For example, in my last internship at Nest Labs, two interns and I spent an entire day evaluating the consistency of the iOS application vs. the Android application. The consistency between these two separate and very different platforms was important for the company image, as they wanted the user experience to be equally good for both iPhone and Android users. Also, the consistency of the Nest application within each of these platforms was important for giving the right feel to iPhone or Android users. For example, the spinners indicating that a page was loading should be consistent with the platform's spinner. A user wouldn't catch this type of inconsistency if he/she was simply evaluating the functionality/ease of use of that individual application.

The Pandora Mobile application violates the following heuristics: 1 - User Control and Freedom (support undo and redo): If a user accidentally skips a song they wanted to hear, there is no way to get back to it. This causes a negative user experience, and would likely cause the user to switch out of the application to another application (ie. iTunes, YouTube, etc.) to play that song they wanted to hear. 2 - Aesthetic and Minimalist Design: the main screen when Pandora plays music is covered by an advertisement. While this is clearly a monetization technique, all of the aesthetics of their UI are lost when more than half of the image associated with the artist whose music is playing (CD cover, poster, etc.) is covered by an advertisement. 3 - Help users recognize, diagnose, and recover from errors: when the mobile device doesn't have service (or is in Airplane mode), the application simply does not provide its functionality. It doesn't let users play/pause a song, skip, etc, and it does not notify the user of what is wrong. This provides a negative user experience since the user is likely to think the problem is with the application, as opposed to their limited connectivity to service.


Emily Sheng - 2/12/2014 15:01:16

The gulf of execution is the "distance" between the interface language and the lowest level of machine language. If an interface figuratively lies "close" to the low-level machine language, then users need to do a lot of work to use the interface. The gulf of evaluation is the "distance" measured by the "amount of processing structure that is required for the user to determine whether the goal has been achieved." If the output is not in terms of the user's goals, then the user has to mentally translate the output values into terms of his goals. We can bridge the gulfs from the system side or user side. For example, the system designer can create languages that are more specific and high level for users, or users can become more mentally competent through different thought processes or perhaps training.

Heuristic evaluation can be more beneficial because observers do not need to interpret evaluators' actions -- only record evaluators' comments. Also, for heuristic evaluations, observers are allowed to answer questions, and by doing so, they may be able to better judge the interface usability according to different types of people. In addition, heuristic evaluations can be done with paper prototypes as the purpose is to just evaluate the interface.

The game Minion Rush violates a few heuristics. It violates the heuristic of "aesthetic and minimalist design" as it has a lot of ads and extra features cluttered into the screen that I feel are rarely used. In addition, the game also violates "visibility of system status." Sometimes the screen freezes because it is loading, and other times it freezes and then crashes. An icon indicating that the screen is "loading" would be helpful.


Will Tang - 2/12/2014 15:15:10

1) The gulf of execution is the distance between the user's intents and actions and the actual tasks performed by the machine to satisfy these intentions. For example, the "Turing Tarpit" has an interface language that is very close to the shuffling of bits, and therefore has a very wide gulf of execution because the user's intentions must be translated to complicated instructions that are fed to the machine. The gulf of evaluation is the distance between the actual output of the system, and the expectations and evaluative capacity of the user. An interface that does not indicate enough of the information that a user expects will have a wide gulf of evaluation, because the user must further evaluate the data on their end. One way to bridge the gulf of execution is to map the goals of the user to the mechanisms of the system, and to design the interface such that the user need not convert their goals to more complicated instructions. Good mappings are key to bridging the gulf of execution. To bridge the gulf of evaluation, the system needs to output a good conceptual model of the data that is expressed in terms that the user understands. The user should be able to interpret the data and extract what they need with minimal effort.

2) Heuristic evaluation can be more beneficial than usability testing when the designers are looking to improve on specific heuristics. For example, in a typical usability test the tester is left to interact with the interface, and it is up to the designer to interpret what issues the tester may have. In addition, the tester does not have a prepared set of heuristics that they can use to evaluate the interface. Users in a typical test environment may also be more reluctant to answer questions than heuristic evaluators, partly because they are questioned about mistakes they may have made, and partly because it involves more work on their end. Heuristic evaluation sessions are usually conducted in a more controlled environment where the tester is willing and prepared to answer questions. Ultimately heuristic evaluation creates a more comfortable environment for testers to answer questions, as well as more informed responses for designers.

One example of a mobile application that violates a couple of heuristics is the AC Next Bus app. For the most part, it violates the "Flexibility and efficiency of use" heuristic. The process of favoriting a stop is unnecessarily long: the user must first select a stop, and then hold down on the tiny heart button that is too close to the edge of the screen. The app then asks to confirm the creation of the favorite, but once it is created there is no confirmation message. This is also a violation of the "Visibility of System Status" heuristic. When the user wants to delete a favorite, they must once again tap the tiny heart icon, which requires several tries to open. They can then either swipe left to reveal the right side delete button, or press the edit button at the top left, which also reveals another delete button on the left. Pressing this delete button will reveal the right side delete button, which the user will actually use to delete the favorite. If the user chooses to delete through the edit button, they will end up going through several unnecessary steps. This violates the "Consistency and Standards" heuristic because for some reason each favorite has 2 delete buttons and the user may find this confusing. This also violates the "flexibility and efficiency of use" heuristic due to the extra steps. Another issue is the fact that the navigation button at the bottom right of the screen does not zoom in on the user's current location like other apps such as Google Maps. Pressing this button zooms in on the closest stop, which violates the "consistency and standards" heuristic because the user will likely expect a different outcome from the one indicated by the interface. Lastly, the application does not update automatically. It prompts the user for an update confirmation periodically, which is both annoying and unnecessary. Ultimately, this app sucks.


Vinit Nayak - 2/12/2014 15:26:21

The Gulf of Execution is measured by how much work the user has to put in vs how much work the machine does. The more structure that is provided by the system, the easier it is for the user to adapt to it, and the less error prone it potentially is. On the other hand, the more the user has to adjust to the system, the greater the gulf, and the more problems can arise due to human error. The Gulf of Evaluation is the amount of work needed to determine if the interface accomplished what the user wanted done. This depends on how the system displays the output, and whether or not it is easy to evaluate success based on the given output. The interface, depending on its functionality, needs to display output in the correct way so as to shorten the gulf of evaluation and put less of a burden on the user to evaluate the results.

Heuristic evaluation can be more beneficial if the people observing want to focus on specific domains of the application and get good feedback there. This is done by allowing the user to ask questions and having observers give feedback to them, which in turn also helps them evaluate their predetermined heuristics. In traditional user testing, this is not possible since questions are typically not answered and the user must figure out everything for themselves. The text also states that heuristic testing is more feasible at an earlier stage in the development cycle. This is very advantageous since it can save a lot of time and money if changes need to be made to the product. Traditional testing is done towards the end, after features have been implemented, when it can be costly to go back and fix mistakes.

There was a real time event management application I had used last year called Kango that violated some of the key heuristics required for UI/UX.

1. Match between system and the real world: The application might have been outsourced and had many incorrect idiomatic expressions and phrases. For example, when trying to book a ticket for a sporting event, the row and section labels would always be switched. It also would never make anything plural when it needed to be. 2. Error prevention: There was no error prevention or checking of bad input anywhere in the application. There were many places where putting symbols and unexpected characters into input areas would cause the application to crash. There were no clear instructions on what the app expected, and there were not enough error checks if the user managed to put in bad input anyway. 3. Help users recognize, diagnose, and recover from errors: This was simply nonexistent.
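The kind of input checking that was missing might look something like the sketch below. It is purely illustrative: the function `validate_seat_label`, the 1 to 4 character rule, and the message wording are assumptions, not anything known about Kango.

```python
import re
from typing import Tuple

# Hypothetical rule: a row or section label is 1 to 4 letters or digits.
_SEAT_LABEL = re.compile(r"[A-Za-z0-9]{1,4}")


def validate_seat_label(raw: str, field_name: str) -> Tuple[bool, str]:
    """Return (ok, cleaned value or plain-language error message)."""
    text = raw.strip()
    if not text:
        return False, f"{field_name} is required."
    if not _SEAT_LABEL.fullmatch(text):
        # Say what is wrong and how to fix it, instead of crashing.
        return False, f"{field_name} must be 1 to 4 letters or digits, for example 12 or B."
    return True, text.upper()


print(validate_seat_label(" b7 ", "Row"))
print(validate_seat_label("#$%", "Section"))
```

Rejecting bad input with a message that names the field and gives an example addresses both the error prevention and the error recovery heuristics at once.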


Albert Luo - 2/12/2014 15:26:59

A gulf of execution is the difference between what a user thinks is necessary to do what he wants, and what the system actually requires for that intention to be achieved. To bridge the gulf of execution, we can reduce the number of steps the user must take to complete an action, so that the system requirements are simpler for the user and therefore more closely match the user's expectations. A gulf of evaluation, on the other hand, is basically how much feedback the interface provides for the user to understand the state of the system. This can be bridged by providing real-time feedback for actions the user does. We can also make sure it is possible for the user to determine what result would come about from performing an action, and can add more elements to indicate to the user what state the system is in and where the user can continue from that point.

Heuristic evaluation can be more beneficial when there aren't any users to actually test the application, or if it's still too early for the application to be tested. It helps ensure a minimum level of quality without actual user feedback. I have a reading application on my phone that violates several of these heuristics. One is that it doesn't help me diagnose and recover from errors. It simply crashes with a message saying "The application has crashed," with no way to send feedback to the developer and no way to know what's wrong. A second one is a lack of user control and freedom. Often I accidentally click to switch to a different chapter. Unfortunately, there isn't a way to move to a specific page, so I need to scroll through the whole book to get to where I was previously. This same problem also indicates a lack of error prevention. The mechanism for switching chapters is a finger swipe along the bottom of the screen, but there's no cancel option, so by the time I see myself switching chapters, it's too late, and I've already moved far away from the page I was reading.


Opal Kale - 2/12/2014 15:38:39

"Gulf of execution" describes the gap between a user’s goal for action and the means to execute that goal. “Gulf of evaluation” describes the gap that must be crossed to interpret a user interface display, following the steps interface -> perception -> interpretation -> evaluation.

The gulf of execution is bridged by making the commands and mechanisms of the system match the thoughts and goals of the user. The gulf of evaluation is bridged by making the output displays present a good conceptual model of the system that is readily perceived, interpreted, and evaluated. Basically, to bridge both the gulf of execution and the gulf of evaluation, one must minimize cognitive effort. A better interface system also helps bridge the gulfs. In addition, the designer can provide the user with a higher-level language, one that directly expresses frequently encountered structures of problem decomposition. Also, the user can develop competence by building new mental structures to bridge the gulfs.

Heuristic evaluation can be more beneficial than usability testing because heuristic evaluation requires only one expert, whereas usability testing needs users, a place to test them and payment for their time. Heuristic evaluation is most beneficial (in comparison to usability testing) in the early stages of design.

A mobile application that violates heuristics is the Padgram app for iPads:

Aesthetic and minimalist design: The Padgram app has too many colors and buttons, and the app is too cluttered for me to just view my Instagram feed, which is the main reason I use Padgram (Instagram on an iPad)

Consistency and standards: The Padgram app has a camera icon, and when you click on the icon, it shows you your photo album of all your photos. This is not consistent because in every other app, clicking on the camera icon allows you to take a picture.


Zack Mayeda - 2/12/2014 15:39:10

1) The gulf of execution is the relationship between what the user wants to accomplish and what the machine they are using must do to complete this goal. One way to reduce this distance is to provide the user a higher-level language that they can interact with to accomplish their goal, which is then translated into a lower-level language that can specify what actions the machine needs to take. A drawback of this method is that it is difficult to account for all human intentions in the higher-level language, so even these languages become quite complex.

The gulf of evaluation is the amount of mental effort that the user must undergo to determine if their goal has been accomplished. This distance can be reduced by changing the output display to match the user's semantics or expectations, even if that output wouldn't normally be shown. A drawback to this method is that the output can be very specific and customized to certain user cases.

2) Heuristic evaluation can be more beneficial because the process of answering evaluators' questions during the evaluation lets the designer gain more information about the usability of the interface, rather than getting bogged down in the mechanics of the interface. It is also beneficial because heuristic evaluation can be performed with prototypes since no interaction with the system is required, simply evaluation.

I chose to evaluate the Instagram app. It violates the 'Visibility of system status' heuristic because by looking at the navigation bar it isn't clear what page you are looking at - the camera icon is always highlighted and other navigation buttons change very subtly. It also violates the 'Consistency and standards' heuristic because the conversation bubble icon is used in two places but for two different meanings - once it represents the comment action and another time it represents a notification. Lastly, it violates the 'Recognition rather than recall' heuristic because any application settings take several steps to find and are often located in unexpected places - even for a small app, there are multiple ways to accomplish the same task and documentation is buried at the bottom of a long menu.


Sang Ho Lee - 2/12/2014 15:41:17

1. The gulfs of execution and evaluation are the gaps that exist between the user's goals and the system state. Interface languages, which encompass the modes of interaction between the user and the system, exist to bridge these gaps. For example, in a word processor, the gulf of execution is bridged by the computer keyboard. The keyboard is a physical representation of the alphabet laid out as an organized set of buttons corresponding to each letter. The gulf of evaluation is bridged by the GUI immediately displaying the input from the keyboard on the computer screen. The user can immediately evaluate the state of the system and the results of his/her input through this interface language. Another example is the comparison between the piano and the violin as mentioned in the reading. The piano's fundamental workings are based on striking a string with a hammer, and the gulf of execution is bridged by the user hitting keys that correspond to the inner hammers. On the other hand, the user of a violin has a much more direct interface language-- he/she takes the bow, positions the hand and strikes the string with the bow. While the gulfs of execution are bridged in different ways, it is important to take into account the task at hand, and in doing so, the level of articulation required by the user. While the piano is immediately friendlier to the neophyte, the violin offers a more direct interface language that allows for a greater articulation between the system and user, which becomes immensely beneficial to the master violinist who requires control over the subtleties of his instrument.

2. Heuristic evaluation can be more beneficial than usability testing for two main reasons. First, heuristic evaluation often discovers usability problems more quickly than usability testing because it allows the users to voice their problems on the spot. By discovering the problem and helping the user while the user is testing the product, evaluation time is not wasted and the problem is found in the context of the characteristics of the product and the user's interactions, which allows for better solutions. Secondly, heuristic evaluation can be quantitatively measured by constructing and obeying a set of usability principles. A set of usability principles can be immensely useful when building a product for a certain persona, and it allows the evaluator to be able to take into account the variability that arises in user testing.

A mobile application that violates at least 2-3 heuristics is the Weather Channel application. It violates the "aesthetic and minimalist design" heuristic. The screen is simply too full. In an attempt to take up every inch of screen real estate, the application throws in a banner ad at the bottom of the screen, and on top of that displays thumbnails of videos. The user may only want the weather forecast, not any of this extra stuff. The app also violates consistency and standards. It looks wildly different from the standard iOS platform's current design conventions. There are no familiar iOS navigation elements, and it may be visually confusing for the user.


Erik Bartlett - 2/12/2014 15:42:44

The gulf of execution is the difference between what the user, at a high level, wants to do and what instructions they must use in order to get the machine to do the action. An example could be generating the list [1,2,3,4,5,6] in C with a for loop as opposed to in Python with a list generator. The gulf of evaluation is the amount of manipulation and evaluation the user must do to interpret the output of the machine and see if it corresponds to what the user wanted to happen. This can be seen by comparing a person reading a binary number with a person reading a base 10 number. A way to get rid of these gulfs is to abstract away the details of how a computation happens on a machine level, allowing the user to express high-level ideas. Also, you can become more specific with the tasks your device can do, allowing for more assumptions and less user-specified input.
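As a rough illustration of the list and number examples above, here is a minimal Python sketch. The variable names are made up for illustration, and the explicit loop stands in for the step-by-step style the response attributes to C.

```python
# Execution side: two ways to build the list [1, 2, 3, 4, 5, 6].

# Step-by-step version: the user spells out how the list is built.
numbers_loop = []
for i in range(1, 7):
    numbers_loop.append(i)

# Higher-level version: the user states what they want in one expression.
numbers_direct = list(range(1, 7))

# Evaluation side: the same value is harder to read in binary than in base 10.
value = 6
as_binary = format(value, "b")  # binary digits as a string
as_decimal = str(value)         # base 10 digits as a string

print(numbers_loop, numbers_direct)
print(as_binary, as_decimal)
```

Both versions produce the same list; the difference is only in how much of the mechanism the user has to write out.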

Heuristic evaluation allows for more interaction with the user. You can get more detail by interacting with the user while they use the product. There can be clarification about what exactly the problem is, and what exactly the product is supposed to be able to do. When the user has a better idea of what the product is for, they can gauge it in light of that idea of usability as opposed to blindly trying to use the interface.

An interface that does not follow the design heuristics would be the audio system in my car. 1. Does not follow standards of the platform: The tune button switches between the AM/FM functions of the radio, as opposed to being used to change your frequency on a given spectrum

2. Uses recall instead of recognition: To change the levels of the bass/treble/fade the user has to press and hold the same button that is used to power the radio - a function that took me 2 years to find in my car. The same goes with the ability to program stations into the numbers; you have to hold them down while on a station - something that is not a first response to me.


Sol Park - 2/12/2014 15:42:54

1) The gulf of execution describes the gap between a user's goal for action and what it actually takes to execute that goal. A gulf of evaluation describes the gap between the extra steps needed to execute the goal and the time it takes a person to understand what those extra steps are. To bridge these gulfs, we can use usability methods and/or heuristics. Usability work reduces the gap by removing steps that require extra thinking. Heuristic evaluation is a very efficient usability method. These methods can increase the chance of successful completion of the task. 2) Heuristic evaluation is a useful usability method for finding the usability problems in a user interface design. For user testing, one normally wants to discover the mistakes users make when using the interface. Often experimenters help them if necessary. Also, users are requested to discover the answers to their questions by using the system rather than by having them answered by the experimenter. For heuristic evaluation, the observer does not help the evaluators until they are clearly in trouble and have commented on the usability problem in question.

Application: Google Maps. Violations: 1. Aesthetic and minimalist design: I used to love Google Maps and I always used it. However, I recently upgraded, and I think the new version violates aesthetic and minimalist design. It shows buttons for nearby restaurants and gas stations first, rather than the search result. 2. Help users recognize, diagnose, and recover from errors: With the new version, it keeps giving me different kinds of errors, then suddenly stops working and is forced to close. Any explanation would be helpful. 3. Visibility of system status: With the previous version, I was able to search for a location with two clicks. With the new version, I only see the actual map after going through four pages (steps).


Everardo Barriga - 2/12/2014 15:44:13

The gulf of execution in essence is the gap between the user’s intentions and the actual execution of the task. The user’s intentions can be articulated in a completely different way than the actual execution of the task which leads to this gulf between the two that must be bridged.

The gulf of evaluation is almost the same as that of execution but instead deals with output. Essentially, the gap exists between what the user expects the output to be and what the user will translate the output to be. For example, if the user sees that a scale gives his/her weight in kg but the user only knows pounds, then the user must make the conversion between the two units, and a disconnect between the output and the user exists.
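The scale example can be sketched in a few lines of Python. This is only an illustration: the function names and the `unit` parameter are hypothetical, and the conversion factor is the usual approximation of 2.20462 pounds per kilogram.

```python
LB_PER_KG = 2.20462  # approximate number of pounds in one kilogram


def kg_to_lb(kilograms: float) -> float:
    """Convert a weight in kilograms to pounds."""
    return kilograms * LB_PER_KG


def format_weight(kilograms: float, unit: str = "kg") -> str:
    """Show the weight in the unit the user already thinks in (hypothetical helper)."""
    if unit == "lb":
        return f"{kg_to_lb(kilograms):.1f} lb"
    return f"{kilograms:.1f} kg"


# A scale that only prints kg leaves the conversion to the user;
# one that respects the user's unit does the conversion for them.
print(format_weight(70.0))
print(format_weight(70.0, unit="lb"))
```

Moving the conversion into the display is one way the system, rather than the user, can do the work of crossing the gulf.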

There are basically two ways to bridge the gulfs that semantic distance tends to create, and one of them deals with abstracting the system. The point is to get to the user side of the gulf by abstracting away a lot of the lower-level semantics the user shouldn’t need to know about to get his/her task done. The user should not need to know about memory management in terms of bits in order to save their file. So it is up to the system designer to abstract these concepts into something that is both legible and understandable to the user who does not have specialized knowledge about the lower-level constructs. Another obvious way to bridge the gulf of evaluation is to provide the user with the output he/she needs in a way that is understandable and in accordance with the user’s intentions. You might run into some problems here because you run the risk of making a system that is too specific and will not scale generally. The time you spend making it special for one particular need is time you lose for another need.

Another way to reduce semantic distance is for the user to learn the necessary skills to be able to work with the lower-level aspects of the system. The user can begin to remember how to do things rather than have to learn how to do them every time. This requires the user to have an understanding of something he/she otherwise wouldn’t have bothered with, but it helps close the gap.

I think heuristic evaluation is a much more structured and organized form of usability testing. I definitely think it can be more beneficial because it doesn’t allow the user to focus on the things that don’t make up good interfaces; instead, it provides the user with a way to articulate themselves in terms of the interface. The user will now be able to talk to the observer in a way that is beneficial to both of them. I also thought the part about putting the user in different scenarios could help tremendously, namely because it’s one thing to test your product on what you think the ideal scenario for the user will be, but you never think about how the user might use your product in highly unlikely scenarios. For example, one might think about how their interface will be used in the rain or snow.

Fifa 14 App. Error prevention: I am trying to play a game online but can’t because I don’t have an EA account. But when I open the app I am never asked to log on to my account, and so I continue to think that this option of playing games online is available to me when in fact it is not. If I were to state at the beginning that I don’t have said account, then the option would be gone and so would the error. Help and documentation: There is absolutely no documentation for the app that I can find, and in particular there is no documentation for how to use certain controls. Occasionally a dialogue window will show up showing you how to do different moves, but there is no central location that I can go to for all of the information.


Juan Pablo Hurtado - 2/12/2014 16:04:05

1) Gulf of execution: The "distance" from what the user wants to do, for example, "format this paper", to the things that the user actually has to do to accomplish that goal.

Gulf of evaluation: The amount of processing the user needs to do to determine whether the goal has been achieved.

2) Because heuristic evaluation is more structured, (you have a base set of heuristic rules as rules of thumb) you can explore more efficiently the space of usability issues compared with usability testing that is more random, because the user has the freedom of doing whatever he wants.

DriveNow: a) Aesthetic and minimalist design: For a mobile app, the interface is very cluttered, with too much confusing extra information and features that should be on their webpage instead of in the app. For example, they have a "vehicle filter" for filtering different vehicles, but they just have one model of one make (at least for San Francisco; I don't know about the other cities that they support).

b) Help and documentation: The help is very confusing, because there is no help for using the app; it is more like an explanation of how the system as a whole works. This is related to point a), because the info on how the system works should be on their webpage instead of in the mobile app.

c) Consistency and standards: Because it looks like it isn't a native app, it is not very consistent with either the Android or the iOS guidelines.

d) Error prevention: It's very lacking in error prevention. For example, when you are registering and you have to input your phone number, it doesn't check the length of it; it just checks that you input at least one digit. This means that if the number is 510 432 2456 and I input 510 432 32, it won't tell me that the length is not valid.
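The missing length check in point d) might look something like the Python sketch below. This is not DriveNow's code: the function name is hypothetical, and the sketch assumes US-style 10-digit numbers like the one in the example.

```python
import re

EXPECTED_DIGITS = 10  # assumption: US-style numbers such as 510 432 2456


def is_valid_phone(raw: str) -> bool:
    """Return True only if the input contains exactly EXPECTED_DIGITS digits."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, and other non-digits
    return len(digits) == EXPECTED_DIGITS


print(is_valid_phone("510 432 2456"))  # full number from the example
print(is_valid_phone("510 432 32"))    # truncated number from the example
```

A check like this catches the truncated number at the moment of entry, before the user ever submits the registration form.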



Sol Han - 2/12/2014 16:01:34

The gaps between the user and the object are referred to as "gulfs". The gulf of execution is the gulf between the user's intentions and the object's mechanisms ("I want to calculate this" vs. the machine instructions). The gulf of evaluation is the gulf between the user's intentions and the object's output ("I want to see how this variable has changed over time" vs. a scatter plot). We can bridge these gulfs by reducing the semantic and articulatory distances of these gaps; for example, one can help make the object's language more closely match that of the user so that the user can easily pick up the object (piano keys have more direct, understandable structures compared to violin strings).

Heuristic evaluation can be more beneficial because having many evaluators to search for problems in a product can help you get a more complete picture about what works and what doesn't for a product (different people tend to use products in different ways). One example of a mobile application that violates heuristics is my AC Transit app that shows schedules for buses. It is not efficient because it requires me to scroll down a long list of bus lines I never take in order to get to information about the 1 or 1R. Furthermore, it doesn't give the user much power to group/save frequently-used bus lines for future reference; often, I won't care if I need to take the 1 or 51B, but this app forces me to check the separate schedules for each one.


Kaleong Kong - 2/12/2014 16:24:50

The gulf of execution is about the low-level bit manipulation. The gulf of evaluation is about the calculation. There are different ways to bridge these gulfs, like using a user interface.


Brian Yin - 2/12/2014 16:29:26

The gulf of execution is the disparity between what actions a user believes he or she can take given an interface and what actions an interface actually allows. The gulf of evaluation is the disparity between what a user expects to be meaningful or helpful output and what the interface actually outputs. These gulfs can be bridged in two general ways. 1) from the system side (a.k.a. designing better interfaces by utilizing higher-level languages to abstract away complexity for the user) or 2) from the user side (a.k.a. user learns how to use the system better via automatization). Specifically for the first way, interfaces could have more or better signifiers or utilize symbols/languages already known by users.

Heuristic evaluation may be better than usability testing in cases where you have many evaluators who are already qualified in designing user interfaces. In heuristic evaluation, the evaluator has the brunt of the work in analyzing an application. In order for that to be effective, the evaluator needs to know what they are looking for (a.k.a. be familiar with interface design and good heuristics). Heuristic evaluation may be more useful because it may go faster, as evaluators may be helped through an interface as need be.

The mobile application I analyzed was the Quora mobile application. A couple of heuristics the application may have violated include: 1) Consistency and Standards: Under the 'Browse' window, there is a list of unanswered questions. Below that is a button that says 'Shuffle'. I imagined that 'Shuffle' would randomize the list of unanswered questions displayed, but instead it directed me to a random question. 2) Recognition rather than recall/User control and freedom

The application often refreshes the home page which causes posts that I may have wanted to read later to disappear. No functionality exists to undo the refresh. This forces me to remember what the questions/answers were about.


Gavin Chu - 2/12/2014 16:31:32

The gulfs of execution and evaluation are the degree of understanding of actions vs. interpretations based on the user's intention. The gulf of execution is determined by how responsive an action is. For example, controlling the screen by moving a mouse has a small gulf of execution because the action performed on the mouse is reflected directly in the visual result on the screen. The gulf of execution can be bridged by simplifying actions to match user behavior. The gulf of evaluation is determined by whether the app accomplishes what the user expects, that is, whether the app served its purpose. The gulf of evaluation depends largely on semantics, or meaning. It evaluates the execution to see if a goal is reached. The gulf of evaluation can also be bridged by simplification: if the feedback from execution is easy to understand, then evaluation is also easy.

Heuristic evaluation can be more beneficial than usability testing when the designers want to figure out what the app is missing. Heuristics are basically a checklist of requirements/problems. Heuristic evaluation is more organized than usability testing because it focuses on discovering problems pertaining to a category.

A mobile app that violates a couple of heuristics is Snapchat. The "user control and freedom" is very limited when editing a photo. The app only allows undo for drawing, but sometimes I want to simply erase stuff. Snapchat also fails the "consistency and standards" heuristic for its manage page. Snapchat recently added new features that allow users to add filters, time stamps, weather stamps, etc. However, the manage page that allows the user to enable each of these features doesn't really have any explanation of what features are added when enabled, or even how to use these features when enabled. This also leads to a lack of "help and documentation." Although Snapchat does provide online documentation, I still can't figure out how the speed smart filter works! The additional features I mentioned above are not clearly explained in the documentation, so it's hard for users to become experts at Snapchat unless they experiment with it a lot.


Cory McDowell - 2/12/2014 16:38:14

1) The gulf of execution is bridged by making the visual aspects of a direct manipulation interface work the way a user would think. The gulf of evaluation is bridged by making the output of a program useful so that it reflects that the input has been understood and evaluated.

One way to bridge the gulf of execution is shown in the program Logisim, where each tool is created to visually illustrate what it will do. For example, addition and multiplication blocks have ‘+’ and ‘×’ signs to illustrate exactly what the block does.

One way to bridge the gulf of evaluation is also shown in Logisim. We are given an output to our system, but when Logisim is running, it shows how the input is affected by each element in the system, so we know that the output used our entire circuit.

2) Heuristic evaluation can be more beneficial than usability testing because we evaluate an application on a consistent set of metrics. In usability testing, the evaluation is biased towards what the evaluator feels is important, not the important factors established in a list of heuristics.

A mobile application that violates two heuristics is Snapchat. The first violated heuristic is “User control and freedom.” When drawing on a photo, the user has the option to undo the last drawn path. However, the user cannot redo their action, violating this principle. It also violates “Recognition rather than recall.” When selecting who you are going to send your snap to, users cannot see the photo they are sending. They must remember this photo, violating this principle.


Kevin Johnson - 2/12/2014 16:39:50

The gulf of execution is the distance between the way the user attempts to execute commands to manipulate the system and the system's interface for accepting command input. Essentially, the gulf of execution refers to system input. The gulf of evaluation is the distance between the way the user expects to receive information from the system about its contents and the way that the system provides that information. Essentially, the gulf of evaluation refers to system output.

The gulf of execution can be bridged by ensuring that the input mechanisms match the way the user already thinks. For example, Python allows boolean operations to be constructed like a sentence: "if this or that". This has a lower distance than requiring a more arcane command to represent "or", like "||". The gulf of evaluation can be bridged by formatting output in a consistent manner that matches the order of the information to the order in which the user looks for the information. Most airline ticket confirmations have an unnecessarily large gulf of evaluation that would be improved by simply reordering the exact same information so that the most relevant information came first.
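The "if this or that" point can be shown with a small Python sketch. The function and parameter names are invented for illustration; the only claim is that Python spells the operator as the word `or`, where C-family languages write `||`.

```python
def needs_umbrella(is_raining: bool, is_snowing: bool) -> bool:
    """Reads close to the sentence: 'if it is raining or it is snowing'."""
    if is_raining or is_snowing:
        return True
    return False


# In a C-family language the same condition would be written as
#   if (is_raining || is_snowing) { ... }
print(needs_umbrella(is_raining=True, is_snowing=False))
print(needs_umbrella(is_raining=False, is_snowing=False))
```

The logic is identical in both notations; the word-based operator simply sits closer to how the user would phrase the condition aloud.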

Heuristic evaluation can find a wider variety of problems than usability testing, and it can do so more cost-effectively. The Android Google Maps application violates several heuristics. It violates the visibility of system status heuristic by making it ambiguous whether the directions are based off of your current location and time, or whether they are previously stored directions which are no longer accurate. It violates the help and documentation heuristic by including features which are not properly explained, such as what routing based on "best route" means relative to other routing options. It violates flexibility and efficiency of use by making it difficult to "bookmark" commonly used locations; I had to install a separate application to perform that task efficiently.


Tristan Jones - 2/12/2014 16:40:25

1) Gulf of execution is about how difficult it is for a user to have the computer do what s/he wants it to do. Gulf of evaluation is about how difficult it is for a user to figure out what the computer meant when a signifier comes from the device. There are several ways to solve these gaps. One of them is skeuomorphism, where an application's UI resembles its real-world counterpart. For example, a note-taking app may look like a notepad since users are familiar with it. This helps reduce the gulf of execution because people are familiar with physical devices and the familiarity transitions over. Another way to bridge these gulfs is to have a quick tutorial for the user to familiarize him/herself with the program. After running through the tutorial the user has a better understanding of what the program does and how to interact with it.


2) Heuristic evaluation can be more beneficial than usability testing when you 1) don't have many resources to perform user testing, 2) can only test a small number of people, and 3) have target testers who are already very familiar with the product and know how to avoid your product's pitfalls. I still believe that heuristics cannot replace usability testing, but I see them as helpful during the initial stage of the product development cycle. One app that has many of these issues is Fitocracy for iPhone. It was one of the first mobile tracker apps to come out and has many, many issues. To enter your workout, you need to select the exercise from a large search list. However, the default search list only includes the 200 most popular exercises. If it's your first time looking for an exercise, you will get extremely frustrated trying to find an exercise that you're sure is in their exercise database. This is an example of a failure to Help users recognize, diagnose, and recover from errors. Another example is its horrible Visibility of system status. Sometimes if you hit the home button while it's submitting your exercise list, it will silently fail, and when you go back all your exercises are copied twice. If you try to submit now, you'll get twice as many exercises and you have to manually delete them. Furthermore, successive iterations have made it worse at Flexibility and efficiency of use. When they redesigned the app, instead of it taking one tap to duplicate a set, it now takes 4 taps, and if you mess up it takes another three to fix it. This makes it very annoying to continually enter your sets and creates a poor user experience. It is quite evident that the authors of the app have not taken this class.


Ian Birnam - 2/12/2014 16:45:58

1) The gulf of execution refers to the gulf between a user and their goals and knowledge. This gulf is bridged by making the commands and mechanisms of the system match the thoughts and goals of the user. The gulf of evaluation refers to the gulf between a user and the level of description provided by the systems with which the user must deal. This gulf is bridged by making the output displays present a good conceptual model of the system that is readily perceived, interpreted, and evaluated. For both of these gulfs, the overall goal is to minimize cognitive effort.

For example, to bridge the gulf of execution, you should decrease the amount of information-processing that a user must do in order to use an interface. To bridge the gulf of evaluation, you should decrease the amount of processing structure that is required for the user to determine whether the goal has been achieved. The output must be displayed in terms such that the user knows the goal has been or has not been achieved.

2) Heuristic evaluations can be more beneficial than usability testing for a variety of reasons. Each evaluator inspects/uses the interface as an individual, and only once the evaluating is complete do they get to talk to and hear comments/complaints from other evaluators, therefore allowing uninfluenced feedback. In a heuristic evaluation session, the responsibility for analyzing the user interface is placed with the evaluator, so the observer only needs to record the evaluator's comments, and doesn't need to interrupt the actions of the evaluator. The evaluator can also ask questions of the observer, who can then explain things or give hints, which may give better feedback than just leaving the evaluator in the dark.

An app that violates some of these heuristics is the Yelp mobile app. For starters, it violates the "consistency and standards" heuristic because you could search for "italian," "Italian," or "italian food," and the user may get confused that these three queries will give different results, even though they shouldn't. Another heuristic it violates is "recognition rather than recall." For example, when selecting a restaurant, the address, directions, and phone number are all displayed. However, upon clicking the "more info" button, this information doesn't carry over as well. This requires the user to navigate back and forth, instead of just having the info screen have all of the info.



Vinit Nayak - 2/12/2014 16:46:09

Gulf of Execution is measured by how much work the user has to put in vs how much work the machine does. The more structure that is provided by the system, the easier it is for the user to adapt to it, and the less error prone the interaction could potentially be. On the other end, the more the user has to adjust to the system, the greater the gulf and the more problems could arise due to human error. The Gulf of Evaluation is the amount of work needed to determine if the interface accomplished what the user wanted done. This depends on how the system displays the output, and whether or not it is easy to evaluate success based on the given output. The interface, depending on its functionality, needs to display output in the correct way so as to shorten the gulf of evaluation and put less of a burden on the user to evaluate the results.

Heuristic evaluation can be more beneficial if the people observing want to focus on specific domains of the application and get good feedback there. This is done by allowing the evaluator to ask questions and having observers give feedback to them, which in turn also helps them evaluate their predetermined heuristics. In traditional user testing, this is not possible since questions are typically not answered and the user must figure out everything for themselves. The text also states that heuristic testing is more feasible at an earlier stage in the development cycle. This is very advantageous since it can save a lot of time and money if changes need to be made to the product. Traditional testing is done towards the end, after features have been implemented, at which point it can be costly to go back and fix mistakes. There was a real time event management application I had used last year called Kango that violated some of the key heuristics required for UI/UX.

1. Match between system and the real world: The application might have been outsourced and had many incorrect idiomatic expressions and phrases. For example, when trying to book a ticket for a sporting event, the row and section labels would always be switched. It also would never make anything plural when it needed to be. 2. Error prevention: There was no error prevention or checking of bad input anywhere in the application. There were many places where putting symbols and unexpected characters into input areas would cause the application to crash. There were no clear instructions on what the app expected, and there were not enough error checks if the user managed to put in bad input anyway. 3. Help users recognize, diagnose, and recover from errors: This simply was nonexistent.


Dalton Stout - 2/12/2014 16:48:26

The gulf of execution can be thought of as the semantic distance between the user's intention and the machine's actual instructions. The wider the gulf, the more careful the planning and instructions that must be issued to the machine. On the other hand, the gulf of evaluation refers to the amount of processing that is required in order for the user to determine that their goal was achieved. We can decrease the gulf of execution either on the designer side or the user side. On the designer side we could write a high-level language in which the outputs of the language match the expected output of the user. With regard to the gulf of evaluation, the text recommends something similar to "WYSIWYG" systems in which the output of the program is forced to show semantic concepts directly.

Heuristic evaluation is more beneficial than usability testing when one desires an organized, unbiased test suite. Heuristic evaluation is helpful because the tests can be run by someone with no knowledge of the design principles of the system, they simply need to interpret the heuristic output. Also heuristics are unbiased evaluators because they do not rely on human nature or fallibility for testing. Heuristic tests can also be organized to evaluate only specific features or areas of the application which can be much more beneficial than user testing in a development environment.

Mobile Application: Transit Hero Violation 1) Aesthetic and minimalist design- This app includes various functions (such as the option to hail a cab) on every single screen, when the use of the cab function is sporadic at best. Violation 2) User control and freedom- It becomes a major pain in this app if you create a new 'Trip' but click the wrong destination by accident. Not only is there no 'Undo' but it takes navigation through several screens to fix the issue. Violation 3) Help and documentation- There is no help or documentation available through this app.


Brenton Dano - 2/12/2014 16:51:39

1) The authors use the term "gulf" to represent the separation or distance between the user's goals and knowledge and how well the system they want to use is described to them. There are two types of gulfs and they each go in one direction only. If we look at the gulfs as the space between the user's goals and the physical system, the gulf of execution goes from the goals to the system while the gulf of evaluation goes from the system to the goals of the user. The gulf of execution is used to describe the work the user must do to express his instructions to the system. The gulf of evaluation, on the other hand, is the steps the user must go through to interpret what the system gives back to him in the form of data or readings.

The gulf of execution can be bridged by making the commands and mechanisms of the system align with the thoughts and wants of the user. The gulf of evaluation can be bridged by changing output displays to present a clear and sensible conceptual model of the system that is easy to interpret by the user without much cognitive effort. The reading also mentioned using higher level languages to bridge the gap. Another way that was mentioned is that the user should change their conceptualization of the problem to match the way the system thinks about it. Example of this for the refrigerator air flow example would be to think about the system as a chiller that pumps air into a pipe and a gate that you can allocate a certain percentage of the air to the freezer and the rest to the fridge.

2) Heuristic evaluation can be more beneficial than usability testing because it forces the tester to focus on the small set of "heuristics" instead of just pointing out whatever they see to be wrong. It helps because it makes the tester clarify the issues that they face and directly match them with a violation of a certain usability heuristic. The reading also recommends a certain sweet spot for the number of evaluators. With too few evaluators, you won't catch enough of the usability problems. Once you already have enough evaluators, each one you add just costs more money and might not find enough new problems to be worth the price.

My dumb phone has a web browser called MetroWeb. The UI is horrible. It violates lots of the heuristics but I'll just list a few here.

"Help users recognize, diagnose, and recover from errors" is violated because whenever an error happens it never suggests a solution. It will just say something like "Error load fail" but won't say why. Sometimes it won't load because your connection is too slow. Other times, it's because the page is too large to fit in the small amount of memory of the phone.

"User control and freedom" is violated because if you want to quickly close the browser it won't let you! This is super annoying and a really bad design. For example, if I am loading a page and my phone starts to lag I want to emergency exit. It won't let me. Instead it lags like crazy and then when I finally want to close it I will get a stupid pop up message that asks me if I want to close it. But it won't respond to my button click until the page loads, which is like never, so I just have to take my battery out of the phone and put it back in again.

"Visibility of system status" is violated because the progress bar only has 3 positions. Not full, half full, and full. But it doesn't really represent the state of the browser page loading. Sometimes the bar is full, but not everything is loaded, and there are broken links. Sometimes the bar is nonexistent and the page is half loaded or just simply freezes!

Maybe it's time for me to buy a new phone~


Aayush Dawra - 2/12/2014 16:53:13

The Gulf of Execution is essentially the difference between the work put in by the user and the work done by the machine. In order to bridge this gulf, the semantic distance, which reflects how much of the required structure is provided by the system and how much is provided by the user, should be minimized so that the factor of human error introduced because of user-provided structure is minimal. For instance, if the user needs to adjust to the system a whole lot, more problems may arise because of the user's erroneous tendencies. The Gulf of Evaluation is the amount of work needed to validate whether the interface accomplished the goal the target user wanted to achieve. This largely depends on the manner in which the system displays the output and also on whether it is easy to perform the evaluation once the output is available. In order to bridge this gulf, the information needed for evaluation should not simply be in the output but also in a form that directly fits the terms of the evaluation. For instance, if the user's intent was to note how the amount of water changes in a tank, it would be good to have the amount of water currently in the tank, but in order to shorten the semantic distance between intentions and output and to bridge the Gulf of Evaluation, it would be considerably better to directly have the rate of change displayed on the screen, thereby reducing the mental workload of the user.
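The water tank example can be made concrete with a minimal sketch. The function names, the sample readings, and the output format below are all hypothetical and chosen only for illustration; the point is that the display that reports the rate directly does the subtraction and division that the user would otherwise have to do mentally.

```python
def rate_of_change(readings):
    """Return liters per second between the two most recent readings.

    `readings` is a list of (time_in_seconds, liters) pairs, oldest first.
    This is an illustrative assumption, not a format from the reading.
    """
    (t0, v0), (t1, v1) = readings[-2], readings[-1]
    return (v1 - v0) / (t1 - t0)


def level_only_display(readings):
    """Wider gulf of evaluation: the user must remember the previous
    level and work out the change themselves."""
    _, liters = readings[-1]
    return f"{liters:.1f} L"


def level_and_rate_display(readings):
    """Narrower gulf of evaluation: the output is already expressed in
    the terms the user wants to evaluate (how fast the level changes)."""
    _, liters = readings[-1]
    return f"{liters:.1f} L ({rate_of_change(readings):+.2f} L/s)"


# Hypothetical sample data: the tank drains 10 liters in the last 10 seconds.
sample = [(0, 100.0), (10, 95.0), (20, 85.0)]
print(level_only_display(sample))
print(level_and_rate_display(sample))
```

Both displays carry the same underlying data; only the second one presents it in a form that matches the user's intention.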

Although usability testing has many benefits, it loses out in some respects to heuristic evaluation, since the number of steps involved in usability testing leads to a more time intensive evaluation model. Since more man hours also mean higher costs, heuristic evaluation scores over usability testing in cost-effectiveness as well. Apart from speed of turnaround and cost-effectiveness, heuristic evaluation scores over usability testing in another key area. If, for instance, the designer has identified the best heuristics to evaluate his application keeping the target user group in mind, then heuristic evaluation could be far more insightful than usability testing, since the users would be unaware of the design goals of the application that the designer would be privy to.

The mobile application I considered for heuristic evaluation was Snapchat. The first violation is of the 'Visibility of the System status' heuristic, since on opening Snapchat there are no interface elements to let you know how to proceed or what the application is for. This principle is violated again since you do not have a dedicated friend list for Snapchat; it shows all the people to whom you've sent a request, including those who might not even have accepted it. The second violation is of the 'Help users recognize, diagnose, and recover from errors' heuristic, since Snapchat simply crashes when anything goes wrong without any error notification identifying what exactly went wrong. In fact, recently there was a massive bug that made the application crash as soon as a user logged in without any notification and this affected about 20% of all Snapchat users. On a related note, the third violation is of the 'Help and Documentation' heuristic. In the garb of minimalist design, Snapchat provides very scant documentation to the user, and the user is basically relying on trial and error in order to figure out key aspects of the application. Consider for instance the screenshot feature available to Snapchat users. It took me 2 months to figure out what exactly happened when, in response to one of my Snapchats, I got a "Screenshot" notification from one of my friends, and it was ultimately one of my friends who told me. There was essentially no documentation to smooth the initial user learning process.


Patrick Lin - 2/12/2014 16:53:22

The two gulfs are obstacles to the feeling of directness when interacting with an interface. To minimize the distance between the user and the task at hand, designers should aim to reduce both the gulf of execution, which represents the amount of work needed for the user to translate his goals into physical requirements, and the gulf of evaluation, the effort required to understand the system's conceptual model. They are both resolved by making commands on the interface match the user's thoughts and using an appropriate (higher) level of interface language that serves as a model for the system. This involves creating displays of information that are not too detailed or general and instantly match the display to the user's semantics (e.g. command line editors). The gulfs can also be bridged if users adapt to the representation the designers build.

Heuristic evaluation uses evaluators who look for usability problems with full knowledge of the workings of an application, which possibly allows them to test and suggest updates to features that would go unnoticed by users. Observers are able to provide more help to evaluators than they would be in normal user tests, and also do not have to work as hard to interpret the user's actions and any usability issues he may struggle with.

Piazza App: 1) Aesthetic and minimalist design: the first page the app opens up has a grid layout populated with every class I've ever taken rather than just my current ones, and there is no obvious way to filter them out other than dropping all of my old classes. It "should not contain information which is irrelevant or rarely needed", because while it is possible I will reference an old Piazza post for some information, it is rarely needed.

2) Flexibility and efficiency of use: There is little to no option of customization in the app. Every new post you create requires selecting in all the categories manually, and searching for posts requires navigating to a separate search option, or using the less intuitive folders and filtering system, which often return strange results.


Maya Rosecrance - 2/12/2014 16:54:36

The gulf of execution is the gap between a user’s intention and the actual execution of the intention. The gulf of evaluation is the process the user has to go through to determine whether or not the goal has been reached. It can be bridged by having clear indicators of when a goal is achieved such as a success message or an audible sound. A gulf of execution can be bridged by having more powerful tools so the user has to do less to achieve his goal. A heuristic evaluation gives a different kind of insight: since the tester is commenting on the interface directly and can ask the observer questions, much more of the tester’s opinion comes across without having to be interpreted by the observer, who may be biased. This kind of evaluation is much more valuable when the things being tested are not directly quantifiable and must be measured through the tester’s opinion. The iBooks app violates 2 of the heuristics. First it violates the Recognition rather than recall heuristic as it does not automatically bookmark the place a user is at in the book. Therefore the user must remember which book he/she was last reading and then remember the page number. It also violates the Consistency and standards heuristic. Typically books are organized by subject or genre but instead iBooks has an ambiguous “Categories” tab which does not map to genre or subject.


Armando Mota - 2/12/2014 16:56:41

1) Gulf of Execution - the space between user intention and the task being completed on the interface. The amount of effort a user must put in to be able to translate the way they think about a task to completing the task on the interface is a result of this gulf. For example, if the language is machine language as opposed to a normal GUI navigation “language”, considerably more effort on the part of the user is required to learn the vocabulary necessary to initiate the same action in machine language as in a GUI. Gulf of Evaluation - the difference between what a user considers to be a confirmation of success or failure in achieving their goal and the output the interface gives the user telling him/her what kind of result was achieved. The more the user must interpret, analyze, and extrapolate from the output provided by the interface, the bigger the gulf. The most obvious way to bridge this gap is to involve a high-level language that describes the action in a way closer to that which the user would normally describe it in, allowing the interface/machine itself to translate from high-level to low-level language and bridge the gap. If the high-level language is constructed in such a way as to facilitate easy implementation of this high to low level translation, the burden on the user is relatively light. It is important to have a high-level language that is both general enough to be consistent across many domains and allow the user to be creative with their expression, yet specific enough to allow the user to clearly implement what they intend. Another way is to make the output of the interface match (or get closer to) the user’s expected confirmation. This way faces the same issues as the input side, so a balance between specific enough to meet the user’s semantics and general enough to be widely useful is key.
Users may also change their own semantics and intentions to match an interface’s - as long as its resulting model and structure is readily apparent to the user, the user may be able to adapt his/her views to match it. Designers can take advantage of articulatory distance, and make the distance between the representation of a language/its constituents and the physical representation of these constituents in the world smaller. Things like onomatopoeia achieve this in natural language.

Heuristic evaluation bypasses the biases and interpretation of the observer - in usability testing with an observer, the observer interprets the users’ actions and infers the data. Thus, usability testing requires more work at that step than heuristic evaluation. Heuristic evaluation also allows answering evaluators’ questions - this makes the evaluators’ assessments (the evaluators, remember, might not be experts or even novices in the interface domain) more about the interface’s usability in full, as opposed to the limited functions they know how to use. Another advantage is that heuristic evaluation need not require a working prototype - it can still be in paper or some other form because the people aren’t required to actually be using the interface. This allows this type of testing to be done early in the design process. 2) Transit LA, an app I use for bus travel when I go home to Los Angeles, violates multiple usability heuristics. The first is “consistency and standards”. Most bus or public transportation apps (the ones I use up here thankfully abide by this standard) either begin by presenting you with a list of nearby stops and their associated busses and arrival times, or have a main menu that allows you to access that feature should you want to. This app goes completely against that standard and offers, as its opening page, a list of favorite routes. When you first use this app, it is not initially clear what you should do at this point. Violating the standard in this aspect was a poor choice. It is incredibly inconvenient for a traveler, or someone who is not using this app from the same bus stop every day. And truly, the app has a much larger range of use for someone who is attempting to access public transportation from a location unfamiliar to them, because presumably a person who takes the same bus every day at the same time will remember what times and busses to take and will not require an app. The second is “user control and freedom”.
The app is inconsistent with its page back and undo buttons. It has three menu options on the bottom bar which are ever-present. These options are “favorites”, “map”, and “more”. These allow you to always go to that area; however, there are also, at times, back buttons in the upper left hand corner. It is not clear which buttons to use at which times to get to the screens you want to go to. Some screens that would benefit from having a back function don’t, and much of the app accesses an external website (that is, it accesses a website within the application), in which case going back in the website can sometimes be tedious. The third is “recognition rather than recall”. This heuristic is violated in a somewhat softer way than the previous two; however, it can still be considered a violation. Because the home screen is a list of your favorite routes, it is necessary to use the map function to find a route should you need to on the fly. The problem with this is that in the map feature, you must zoom in to such an extent that you can only see roughly 2-3 square blocks. This is an issue because there is often a range of busses available from different areas, and many of these busses might not be available in that 2-3 block radius, but right around there. So, in order to find these, you either have to remember every bus route in order to navigate to the proper area of the map where the bus runs, or you have to just blindly move the map around until you find an appropriate bus stop. Remembering bus routes you don’t normally take isn’t the kind of thing most people will want to do.


Prashan Dharmasena - 2/12/2014 16:58:11

1) The gulf of execution is essentially how hard the user has to think and how much he/she has to learn to use your interface. An example of a small gulf of execution would be a children's toy, whereas a lot of specialized software tends to have a greater learning curve and thus a greater gulf of execution. The gulf of evaluation is a measure of how complex it is for a user to determine if the action they wanted to perform succeeded. One way of bridging these gulfs is by putting in place a "higher level language," essentially simplifying the interface and making actions more identifiable by the user. An example of this would be a program like Scratch, where the user drags and drops blocks instead of writing code. Another way is by making the output of the interface display more of what is directly needed by the user. An example could be displaying a graph of things over time instead of making the user watch a single number as it fluctuates.

2) Heuristic evaluation can be more beneficial because it allows for users to report problems with an interface quickly and easily. It works best for an interface that is based off of existing systems, as the user will have a better feel for the interface. App: TinyScan - Breaks consistency and standards heuristic. When adding pages to a scan, the user must tap a button with a camera icon. This was not clear to me at first use, and I had to ask a friend to make sure it did what I thought it did. Another case was with a button with an arrow in a circle. I assume this meant "redo/retake picture" but in fact it rotates the picture, which now makes sense, but wasn't consistent with what I was familiar with. - Breaks help and documentation heuristic. Even though it has an FAQ, it is meant more for advanced actions and does not have much explanation for the aforementioned buttons. - Somewhat breaks user freedom. On one hand, it prompts the user before they make any permanent changes, but on the other hand, has no way of undoing or redoing certain actions such as deletion of a page.


Ravi Punj - 2/12/2014 16:58:58

1) The Gulf of Execution is the symbolic difference between the user's goals and mechanisms of interaction. For example, there is a gulf of execution when the user types keys on a keyboard to make letters show up on the screen. The gulf is bridged better if the user can use a stylus to directly write on the screen, so that the method of interaction mimics the user's goals.

A gulf of evaluation arises when the output displayed to the user cannot be readily perceived in a meaningful way. For example, it's easier to read graphs and bar charts rather than go through a spreadsheet and understand the relations between numbers. Therefore, in this example, graphical representation of numerical data help bridge the gulf of evaluation.

2) Reuters iOS app:

a) Heuristic evaluation can be more beneficial than usability testing as it associates the violated usability principles with the usability problems. While that does not lead to direct design solutions, it makes the task of iterating on designs easier by giving the process a goal to achieve (the principle to adhere to).

The Snapchat application violates the following heuristics: a) Snapchats sometimes just spring up, so no indication if I'm receiving a snapchat or not. This violates the "Visibility of system status" heuristic. b) I take a Snapchat; before sending it, if I press the back-button, I lose the snapchat. Similarly going into the settings/contacts page, it is hard to tell what the back button does. Sometimes it takes you to the list of snapchats, and sometimes it takes you to the camera. This violates the "User control and freedom" principle. c) The app has no help or documentation in the app, and the user must go to their website to seek support. This violates the "Help and Documentation" principle.


Seyedshahin Ashrafzadeh - 2/12/2014 16:59:21

1) The interface that we want to design might introduce some distances such that there will be a gulf between the user’s goals and knowledge and the level of description needed by the system. There are two types of gulf. One is the gulf of execution and the other is the gulf of evaluation. The gulf of execution can be bridged by making the commands and mechanisms of the system match the goals of the user. The gulf of evaluation can be bridged by making the output of the system present a good conceptual model of the system that is being evaluated. In this process we are minimizing the user’s cognitive effort. One way to bridge the gulf between the intentions of the users and the specifications by the computer is to provide the user with a higher-level language. Matching a language with the task domain can make the tasks easier. In this case, the user can avoid all the planning and transformation and pass those to the machine. Another way is to change from line-oriented text editors to screen-oriented text editors. This is good because the feedback is immediate and the user can instantly see the changes. 2) In user testing, what matters is the evaluator’s willingness to answer questions and the extent to which he/she can help the user. In this type of evaluation, the evaluator is usually there to observe the user and discover their mistakes, so he/she might be reluctant to answer the questions of the user. However, in heuristic evaluation, it is unreasonable to refuse to answer the evaluators’ questions. If evaluators are non-domain experts, answering their questions will enable them to better assess the usability of the user interface with respect to the characteristics of the domain. Therefore, I think that heuristic evaluation is very beneficial if the evaluators are non-domain experts. In this case user testing would not give as much information as heuristic evaluation.
Also, the time of evaluation is very precious, and by giving the evaluators hints we would prevent the wasted time that could happen in user testing. The mobile application that I would like to talk about is messaging. One of the heuristics that this application sometimes violates is user control and freedom. The autocorrect feature of this application usually interferes with the intentions of the user, and the user is forced to undo and erase all the letters, retype them, and somehow signal to the application what he/she wants. The voice feature of messaging (when you speak the message you want to send) is not always consistent and standard. This feature sometimes gets the correct input and sometimes it is totally incorrect. Also, my Galaxy Note Message application does not have emoji. This violates the match between system and the real world because it does not follow real-world conventions. A lot of users would like their messaging app to have emoji but this application doesn't.


Christina Guo - 2/12/2014 16:59:38

1) The gulf of execution is the layers between the actual physical bit manipulation that executes what the user wants, and the interface that the user interacts with to get their goal accomplished. The gulf of evaluation is the amount of effort or processing that the user needs to do to the output of the program to figure out whether his or her goal has been accomplished. These gulfs can be bridged in a variety of ways. One option would be for the designer to construct higher level languages that more closely match the pre-existing knowledge base of the user as well as the goals of the user by using words or language from the domain of the problem. Another option would be to place the burden on the user. The user can adapt to the representation that the system uses. While this requires greater effort from the user, it may also empower them with new ways of thinking about their goals because of their increased knowledge of how the system works.

2) In some ways, the Facebook mobile app violates some of these heuristics. It violates User Control and Freedom in that there are many functions that the user cannot edit and redo (that may sometimes be possible in the web version). For example, if you want to edit a status to tag someone but have already submitted it, there is no way to edit it in the mobile app, even though this functionality is present in the web application. Similarly, this violates Consistency and Standards because users who are used to the web version of Facebook will have expectations of functionality standards that aren't present in the mobile app.


Matthew Deng - 2/12/2014 17:01:46

The gulf of execution is the distance between the user's intention and the instructions given to the machine. The gulf of evaluation is the distance between the output of the machine and the user's understanding of it. For example, if a user wanted to find out how much money they would have to spend to paint a room, an application that would have a narrow gulf of execution would ask the user to input the dimensions of the room, while an application with a wider gulf of execution would ask for the total surface area of the walls to be painted, which would force the user to calculate it on their own given the dimensions. Going on with this example, the application would have a narrow gulf of evaluation if it outputted the price of the paint needed to paint the room, and a wider gulf of evaluation if it simply outputted the volume of paint needed, from which the user must then calculate the price. These gulfs can be bridged in multiple ways. First, we look at semantic distance, which separates the user's goals from the meanings of expression. The gulfs can be bridged from two ends: the user end and the machine end. From the user end, the user can learn to think in the same way as the machine, so that the user's inputs will match the machine, and the output will match their goal. Likewise, from the machine end, the machine can be redesigned to specifically fit the user's intentions. In addition to semantic distance, there is articulatory distance, which separates the user's form of expression from the meaning of expression. The gulf of execution can be bridged by allowing the user to mimic the actions they want to do. On the other end, the gulf of evaluation can be bridged by making the machine's output easy for the user to understand.
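The paint example above can be sketched as two small functions. This is only an illustration under stated assumptions: the coverage and price constants are made up, the room is assumed to be a simple box with four walls and no doors or windows, and the function names are hypothetical.

```python
# Hypothetical constants, chosen only for illustration.
COVERAGE_M2_PER_LITER = 10.0   # wall area one liter of paint covers
PRICE_PER_LITER = 8.0          # cost of one liter of paint


def paint_liters_from_area(wall_area_m2):
    """Wide gulfs on both sides: the user must compute the wall area
    before calling this (execution) and convert liters into money
    after reading the result (evaluation)."""
    return wall_area_m2 / COVERAGE_M2_PER_LITER


def paint_cost_from_dimensions(length_m, width_m, height_m):
    """Narrow gulfs on both sides: the input is what the user already
    knows (room dimensions) and the output is what the user wants
    to know (the price)."""
    # Four walls of a box-shaped room; floor and ceiling are not painted.
    wall_area_m2 = 2 * (length_m + width_m) * height_m
    return paint_liters_from_area(wall_area_m2) * PRICE_PER_LITER


# A 4 m x 3 m room with 2.5 m ceilings has 35 square meters of wall.
print(paint_liters_from_area(35.0))           # user did the geometry
print(paint_cost_from_dimensions(4, 3, 2.5))  # system did the geometry
```

The second function does not compute anything the first one cannot support; it simply moves the geometry and the pricing from the user's head into the system.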

Heuristic evaluation can be more beneficial than usability testing not only because it saves money and time, but also because it can effectively be used early on in the development process to steer the project in the right direction. One mobile application that I have used that violates a few heuristics is the AC Next Bus application. First of all, it violates the "Error Prevention" heuristic. Every time you load the application, it asks if you want to update the stops; however, every single time it says "Error updating stops - please try again." Going on from this, it violates the "Help users recognize, diagnose, and recover from errors" heuristic in the sense that it tells you to try the "Update Stops" option again, but doing so will fail once again. However, you can go to the main screen and press "Last Updated: HH:MM:SS" to actually update the stops. It is not very clear that this is meant to be a button, which makes the application somewhat lacking in the "Help and documentation" heuristic as well.


Lauren Speers - 2/12/2014 17:02:00

The gulf of execution refers to the distance between a user’s goals and the system’s input language that the user must use to achieve those goals. This gulf can be bridged either by requiring the user to translate his thoughts and goals into the system’s language (for instance, by requiring the user to write commands in a low-level language), or by designing the system such that the interface accepts input in a form similar to that of the user’s goals (for instance, by providing a high-level language that matches the user’s vocabulary). On the other hand, the gulf of evaluation refers to the distance between a user’s knowledge and the system’s output language that the user must interpret to understand the results of his actions. Like the gulf of execution, the gulf of evaluation can be bridged either by requiring the user to learn the machine’s language (for instance, requiring the user to remember a previous measurement and subtract it from a new measurement to determine the rate of change), or by designing the system’s output language to deliver information to the user in his language (for instance, by calculating and displaying the rate of change for the user). Both gulfs can be minimized by decreasing semantic and articulatory distance.
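The rate-of-change example can be sketched as two alternative displays. The function names and the output format are hypothetical, chosen only to illustrate the contrast: the first display leaves the subtraction to the user, while the second performs it and reports the result in the user's terms.

```python
def rate_of_change(previous: float, current: float, elapsed_seconds: float) -> float:
    """Change per second between two measurements."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (current - previous) / elapsed_seconds


def raw_display(current: float) -> str:
    """Wide gulf of evaluation: the user must remember the last reading and subtract."""
    return f"level: {current:.1f}"


def rate_display(previous: float, current: float, elapsed_seconds: float) -> str:
    """Narrower gulf of evaluation: the system computes and shows the rate."""
    rate = rate_of_change(previous, current, elapsed_seconds)
    return f"level: {current:.1f} ({rate:+.2f}/s)"
```

For readings of 10.0 and then 13.0 taken two seconds apart, `raw_display` shows only the new level, whereas `rate_display` also shows that the level is rising by 1.5 per second.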

Heuristic evaluation can be more beneficial than usability testing in the early stages of an interface design. Because heuristic evaluation does not emphasize functionality testing, it can be done before the system behind the interface is completed. In addition, because heuristic evaluation compares an app to existing and predefined heuristics, the violations uncovered are very specific and well-articulated, making the redesign process easier.

The YouTube mobile application violates 3 heuristics: 1. User Control and Freedom: The search functionality does not provide an “emergency exit.” If the user accidentally searches for a misspelled or incorrect topic, he has to wait for the results of his incorrect search to load (which sometimes takes quite a while) before he can search again. This problem is fairly prevalent, but fortunately is not hard to recover from as long as the user is patient. 2. Recognition Rather than Recall: When the user uses the search functionality of the app, the search suggestions and keyboard occupy the entire screen. If the user wants to make a search related to a video he just watched, he must remember the keywords since they are no longer visible when he searches again. Again, this problem is prevalent, but does not have a major lasting impact on user experience. 3. Flexibility and efficiency of use: The mobile application does not permit easy access to suggested videos, a feature many users use on the corresponding website. This problem is prevalent and has a large impact because it is difficult to overcome.


Bryan Sieber - 2/12/2014 17:02:51

The gulf of execution is the gap between the user’s intentions and the actions required to carry them out. Machines of low complexity have wider gaps between user intention and machine instructions. The user must be experienced with the abstract machine’s instruction set to be able to properly use it. Semantic directness/distance is the measurement of how much the user must know about the system to be able to properly and usefully interact with it. The gulf of evaluation is the gap between the output the user intended from their actions and the actual output that the system provided. The user is then tasked with translating the output of the system into the terms of the intended output in order to evaluate it. This is an added and unnecessary burden for the user on the way to the goal. Although these gaps exist, there are multiple ways to solidly bridge them. A few solutions include: using high-level languages, reducing the semantic distance of the output, automating behaviors, having the user change their conceptualization of the system, and analyzing the nature of the task. Providing the user with a higher-level language can make the system easier to use and can more directly address common problems faced by users. The high-level language would allow users to easily create the structure between intentions and expressions in the language. When the semantic distance of the output is smaller, the system is more WYSIWYG. The systems would need to be developed solely for use with particular domains/tasks, and the diction used in the interface language must map well onto the user’s diction in the domain. Although automation isn’t a way to reduce semantic distance, it is a way of increasing directness. This means that a user can easily see how processes are accomplished by watching a skilled user, but may not be able to derive (or rederive) how something was accomplished. Automation here is a process of memorizing the set of actions that produces the correct/wanted output.
Changing a user’s conceptualization of the problem can enable the gap between the user and system to be minimized, because you are in essence moving the user closer to the system.

Heuristic testing has more coverage than traditional usability testing. This is attributed to the fact that heuristic testing allows people who are not familiar with the system's domain to properly assess the system. As a result, the evaluation can bring in a diverse set of reports. For example, the readings mention that certain usability problems were found more often by certain evaluators. An added benefit of using evaluators who are not experts in the system's domain is that they can communicate with the observer. The observer can answer some questions, provided that the evaluator asks a question that is relevant to the problems of the system. One final benefit I noticed in particular was the ability to evaluate at a stage where the design of the system hasn't been fully finished. By this I mean that the system can be assessed as soon as a paper prototype has been created. We keep this in mind because heuristic evaluation doesn't necessarily mean we are trying to accomplish a task with the current state of the system, but instead to debug the system from another person's perspective.

One app that violates several heuristics is Bay Tripper, a Bay Area public transportation application. First Violation: Match between the system and the real world. When I open the application I get a map of San Francisco, even though I am currently in Berkeley. This system fails to speak my language by ignoring the real-world reason "why" I am using the application. Second Violation: Flexibility and efficiency of use. There should be a way for the application to learn from my history of public transportation lines. Especially since I tend to use the 51B to get around Berkeley, the application should have cache-like usability where it would remember my previous interests and work to provide a similar list instead of listing every single AC Transit bus line. Sometimes only the relevant information is really necessary. Third Violation: Visibility of system status. This one is crucial since a lot of the time when I am using this application I will be underground at a BART station. Internet access underground doesn't really work well, and when I first used this app, I wasn't aware that certain regions in the underground station wouldn’t provide an internet connection. There needs to be an immediate alert in the application that tells me that it doesn't know the current schedules of "said transportation" the moment it has lost its network connection.


Justin Chan - 2/12/2014 17:03:21

A gulf of execution is the “gap” that a user has between the time he/she wants to carry out an action and the time the action is actually executed. It can be summarized with the popular saying “easier said than done.” The gulf of evaluation is the “gap” that a user has in interpreting the state of the system. Per the paper, the gulf of execution can be solved by basically translating your intentions to the language of the input. Clearly the example that comes to mind is a programming language. If I want to keep repeatedly visiting a page so it appears that my website is very popular, I can write a script in Python to do so. The gulf of evaluation can be bridged by making sure the system’s output is more in tune with our internal language. A good example of this is when a car has a problem and it flashes a red light. There’s only a very small chance we would find the problem ourselves by routine, random inspection – the red light is our language for “something is wrong,” which alerts us.

Heuristic evaluation can be more beneficial than usability testing when testing with “normal” people who may not be as well versed in usability. This is because the former is a more structured approach that gives you a number of issues to base your testing around while the latter is more of a “free-form” way of testing. With heuristic evaluation, you already know what to look for before you test, which helps to guide the actual test because you are (sub)consciously looking for those heuristics. Additionally, having heuristics makes post-testing evaluation a lot easier because you can categorize the problems you are having, which will come in handy when you are inevitably discussing design changes.

An app that, for me, surprisingly violates 2 heuristics is the popular Mailbox app. Whenever I try to swipe my mail into the trash, it sometimes gets archived instead because the phone thinks I have not swiped “long enough.” (Disclaimer: the Mailbox app basically tries to simplify everything by making all actions a series of swipes, the “duration/direction” of the swipe being how the phone tells what action you want. Archiving and deleting an email both require a swipe to the right – the latter requires a “longer” swipe than the former.) This for me violates the consistency and standards heuristic because “swipe duration/direction” is not that good of a differentiator to begin with. The difference between archiving and deleting an email is huge – they should not both be assigned a rightward swipe with such a small thing differentiating the two.

The app also violates the efficiency of use heuristic because it is very difficult to delete multiple emails at once, something many people need to do in the morning when they wake up to a bunch of spam. With Mailbox, all you can do is continuously swipe emails away, which is a time-wasting process if you know that you’re not going to read some of those emails at all. I want to be able to select multiple emails and delete them at once. Thanks, Groupon.


Peter Wysinski - 2/12/2014 17:07:42

Any sort of interface inherently introduces distance, and there are gulfs between ‘a person’s goals and knowledge and the level of description provided by the systems with which the person must deal.’ The gulf of execution is bridged by having the commands the system uses match the end goals and visions of the user. The gulf of evaluation is bridged by having the program display to the user something that is easy to understand and evaluate. Bridging both of these gulfs makes interaction with a system less obtrusive. Another way to bridge these gulfs is to increase the amount of structure provided by the system; in doing so the user must put less thought into using the system, which in turn makes the distance to be bridged shorter. Yet another way to bridge the intentions of a user and the interface is to “provide the user with a higher-level language, one that directly expresses frequently encountered structures of problem decomposition.” In short this means that the task of translating from the user’s language to the program’s should be done by the system itself, not the user.

Heuristic evaluation is a method for finding usability problems in an interface design so that they can be remediated during the iterative design process. For it to be successful it requires feedback from many evaluators; studies have shown that different people find different usability problems. Each evaluator initially inspects the interface alone to ensure that feedback from others doesn't interfere with their results. Another key difference between typical usability testing and heuristic evaluation is the willingness of an observer to answer questions from the evaluator. In heuristic evaluation, answering evaluators' questions will allow them to better understand ‘the usability of the user interface with respect to the characteristics of the domain.’ A mobile application that violates two heuristics is Snapchat. There is a lack of ‘Consistency and Standards’ in the photo edit screen. Users are initially unsure which button is responsible for drawing on an image and which is responsible for adding text; furthermore there is no ‘recently contacted’ list of users, which is an Android system standard in every application. The application also violates the ‘User Control and Freedom’ heuristic by having no way to alter the modifications done to an image - a user has to take a new photo and start over once he makes a mistake.


Meghana Seshadri - 2/12/2014 17:10:01

(1) The gulf of execution marks the difference between what intentions users have for using a system and what the system actually allows them to do or how well it supports said intentions. The gulf of evaluation, in contrast, concerns the process of the user analyzing and determining whether the goal or system state has been achieved. If the output of the system does not match the user’s original intentions, then the user needs to translate the output so that it becomes compatible with said intentions and the evaluation can be made. Both gulfs of execution and evaluation focus on the differences between the intentions and goals of the user and the expectations and abilities of the system. The gulf of execution can be bridged by having commands and techniques that are translated from the user’s intentions. The gulf of evaluation can be bridged by translating whatever the system outputs into a form that the user understands.

(2) In usability testing, the goal is to discover the mistakes that users make when using whatever interface they’re testing, and because of this, experimenters want to give as little information as possible. This leaves the users to answer their own questions as they continue to test out the interface. In heuristic evaluations, however, answering the evaluators’ questions about the interface enables them to better assess its usability with regard to the interface’s characteristics. Furthermore, in this sense, heuristic evaluations are more cost-effective than usability testing, as experimenters aren’t wasting time with users who struggle to use the interface. The Fandango mobile application violates a few heuristics given in the provided article. It violates:

(a) Aesthetic and minimalist design: While searching for a movie or theater that the user wishes to attend, there is a lot of information about that movie or theater that’s crunched along with its title. The application tries to clutter all sorts of information related to the movie or theater that it believes the user needs to know immediately, when it is probably something better off viewed when that theater is chosen or as a separate drop down search menu feature.

(b) Flexibility and efficiency of use: While the whole point is to sell users movie tickets, it takes an awfully long time to actually get to the point of purchasing the ticket. Users must go through the process of selecting a variety of choices before even coming near to purchasing. This distracts the user from reaching their end objective, and even risks putting the user off before they can get to that point.

(c) Recognition rather than recall: As a user goes through the process that the Fandango application has lined up for buying movie tickets, the user must go through a number of selection pages (movie, time, place, rating, etc.), and when they reach the final step of purchasing, they might have forgotten what time they had chosen or which theater. They must remember which options they are selecting along the way, and with the number of selections that must be made in order to obtain a movie ticket, this can easily confuse the user.


Doug Cook - 2/12/2014 17:11:20

The gulf of execution can be described as the figurative distance between the user’s intention and the machine-level execution of instructions. The gulf of evaluation refers to how much effort the user must place into interpreting the interface’s results (to understand if their goal was met, for example). This type of separation between user and machine can be addressed by means of language. Semantics that are written in terms the user understands and are directly related to their goal will help them gauge what the machine is asking and what it is doing. Another approach to these issues involves choosing “higher-level” languages – meaning a dialogue that looks like what the user would use to describe their goal to another person. Of course both of these ideas raise further issues (generality creates more work for the programmer) so there is a tradeoff present.

Heuristic evaluation can be more useful than usability testing because it provides both a forum for input from multiple observers, and a specific set of “guidelines” (heuristics) that must be thoroughly explained. These facilities produce more specific and logical issues because the observers will be less subjective to their own tastes, and identify violations with the most gravity through discussion. One mobile application that I’ve observed violating some of the heuristics is the “How to Cook!” iPhone app. The relevant heuristics are: 1. Consistency and Standards. The app shows wildly varying interface elements as you navigate through pages, and I sometimes wonder which element will return me to the previous page given that it changes throughout the app. 2. Recognition Rather than Recall. The app displays cooking ingredients and instructions in “sections” spread out across different views. This forces the user to remember quite a bit of information, which is especially annoying if their hands are busy with kitchen appliances. 3. Aesthetic and Minimalist Design. This app is quite verbose and each screen is full of chunky design elements.


Namkyu Chang - 2/12/2014 17:11:40

1) Give brief definitions of gulfs of execution and evaluation. What are some different ways to bridge these gulfs?

The gulf of execution refers to the distance that must be bridged between the user’s intention (e.g. scroll down the page) and the structure provided by the system. The gulf of evaluation, on the other hand, reflects the amount of processing structure required for the user to see if their intentions have been met. In Hutchins, Hollan and Norman’s reading, the authors state there are two main ways to bridge the gulfs: one from the system side by the designer, and another by the user. The former method applies when the system is being built: by implementing higher-level languages, the designer “moves toward” the user, making the system easier for him/her to use. The latter is when the user makes an effort to construct new “mental structures” and becomes a more apt user.

Heuristic evaluation is performed by having each individual evaluator inspect the interface alone. Only after all evaluations have been completed are the evaluators allowed to communicate and have their findings aggregated.
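The aggregation step described above can be sketched as follows. This is a minimal illustration, assuming each evaluator's independent findings are recorded as a set of (heuristic, problem) pairs; the data layout and the function name `aggregate_findings` are hypothetical and not taken from Nielsen's article.

```python
from collections import Counter


def aggregate_findings(reports):
    """Merge independent evaluator reports after all evaluations are done.

    reports maps an evaluator name to the set of (heuristic, problem)
    pairs that evaluator found while inspecting the interface alone.
    Returns (finding, number_of_evaluators) pairs, most reported first.
    """
    counts = Counter()
    for findings in reports.values():
        # set() guards against one evaluator listing the same finding twice
        counts.update(set(findings))
    return counts.most_common()


reports = {
    "evaluator_a": {("Visibility of system status", "no loading indicator")},
    "evaluator_b": {
        ("Visibility of system status", "no loading indicator"),
        ("Error prevention", "update fails on every launch"),
    },
}
summary = aggregate_findings(reports)
```

Keeping each report separate until this step mirrors the rule that evaluators do not communicate before finishing, and the counts show which problems were found by several evaluators and which by only one.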

An example I can think of where the semantic distance in the gulf of execution is large is my SmartTV at home. The basic functionalities are built in such that changing channels and adjusting the volume are very easy. However, using some of its features, like watching a YouTube video, is a hassle. You have to go into the menu, look for the “Internet” option, and from there navigate to “http://www.youtube.com/” manually. From there, you can save the link to a bookmark, but that may be more of a hassle than entering the URL manually every time. (e.g. where does the bookmark go once it’s saved? It took me a good 15 minutes to figure out the bookmark menu is outside the “internet” menu, which was very frustrating). The reason why I say the gulf of execution is wide in this example is that my higher-level intention (go to YouTube) requires multiple steps to carry out. To narrow it, I could adjust as a user to be more apt, or the system designer could have recognized that this might be a popular website that will be visited on the TV, and made a quick shortcut to this site.

2) How can heuristic evaluation be more beneficial than usability testing? Describe a mobile application that violates at least 2-3 heuristics. List each heuristic along with a short explanation of the violation.

Heuristic evaluation can be more beneficial than usability testing when you’re on a budget. According to Nielsen’s article, “heuristic evaluation is explicitly intended as a ‘discount usability engineering’ method.” More so, compared to the X amount you saved, the results are surprisingly comparable to regular usability testing.

An app which violates some heuristics is the MLB.com app for iOS. The first violation is “Visibility of System Status”. When trying to switch tabs to the video section, the previews take some time to load since they are larger files. However, there is no indication in the main app that it is trying to load anything at all. While loading, the page is left as blank, leaving the user clueless as to what’s going on. (Granted, in the iOS system bar, it indicates that something is loading but this is not apparent in the main app itself.) Another violation I found was for “Help users recognize, diagnose, and recover from errors”. Because there are no MLB games playing in February, the scores section is empty. However, playing around with the app under that tab will give an error, but the user has no idea what happened. This is because the app crashes without warning. After reloading the app, it starts on the homepage as if nothing happened. There is no way for the user to determine what went wrong, no error messages in code nor plain language.

Although Nielsen makes heuristic evaluation very attractive, there are some problems with this method of evaluation too. One argument can be seen in Figure 1 in his article How to Conduct a Heuristic Evaluation, which shows that the method highly depends on the evaluators. Although the test results are aggregated, some hard-to-find usability problems will not be seen unless there’s a high data sample or very skilled evaluators.


Charles Park - 2/12/2014 17:12:46

The gulf of execution is the gap between a user’s goal for action and the means to execute that goal. It’s the difference between the user’s perceived execution actions and the actual required actions. The gulf of evaluation concerns the degree to which the system or artifact provides representations that can be directly perceived and interpreted in terms of the expectations and intentions of the users. It’s the difficulty of assessing the state of the system and how well it supports the discovery and interpretation of that state. The gulf of execution is bridged by making the commands and mechanisms of the system match the thought and goals of the users. The gulf of evaluation is bridged by making the output displays present a good conceptual model of the system that is readily perceived, interpreted, and evaluated.

Heuristic evaluation can be more beneficial than usability testing because one is looking for specific traits rather than testing usability in general. As a heuristic evaluator, I have a set of goals I’m looking for and I’m more likely to identify the underlying issues because I’m specifically looking for them. Furthermore, I’m looking from an engineering perspective and can find the targeted issues rather than coming across them at random, as in usability testing. One mobile app that seems to violate a few of the heuristics is the new Facebook viewing app, Paper. First, it violates the User Control and Freedom heuristic in that the user cannot seem to control which of the “highlights” are to be seen. While the user can choose whether or not they want to follow a certain person, the randomly generated(?) main board highlight is chosen by the app and there is not much the user can do to change that with the exception of time. While aesthetically pleasing, the design does not allow the user to change the layout or anything else, except for the visible content. Hence it goes against the Flexibility and efficiency of use heuristic.


Anju Thomas - 2/12/2014 17:13:54

1) Give brief definitions of gulfs of execution and evaluation. What are some different ways to bridge these gulfs?

The gulf of execution mainly defines the relationship between the goals or intentions of the users and how well the system supports the actions intended or performed by the user. The gulf of execution emphasizes the minimal machine complexity needed to bridge the gap between the user’s goals and actions. This span between the user’s target performance and machine instruction is often resolved by more organization, planning and interpretation from the user’s side. The space between the user and machine can be bridged by having the user create an information processing unit. The increased effort from the user’s side to make the ends meet reflects the great distance to be bridged.

An example of the gulf of execution involves a user wanting to record a video using a camcorder. The relationship between the user’s intention in recording and the ease with which he / she can control the system to perform the desired action describes an example of the gulf of execution and the importance of bridging the distance between to allow the user to easily manipulate the system in an efficient way.

Unlike the gulf of execution, the gulf of evaluation describes the state of the system and the way in which it effectively conveys the steps required to be performed by the user to reach their goal or decide whether previous actions can bring them closer to their goal. The writer describes the gulf of evaluation as the amount of processing structure that is needed for the user to evaluate whether the desired output or result is accomplished. In a direct methodology, this means that the result matches the user’s expectations.

To span the gulf of evaluation, the user is expected to interpret the result in a way that conforms to the user’s needs. For instance, this is especially noticeable in the following example, where the user wishes to manipulate the airflow from the air conditioner but cannot see the temperature output. The user in this case can evaluate the air temperature inside the house and mentally compute the differences felt in coldness to assess the result. Here the information needed by the user is in the output of the system, but the user has to perform the calculations to determine the result. If the change in room temperature is shown visually, it reduces the effort needed by the user and reduces the distance between the user’s intention and the output language.

The semantic distance can be reduced in two ways: either from the programmer's side, where the effort comes from the system, or from the user's side, where the user is expected to contribute more effort and develops more competence by creating new mental structures. The first way mentioned by the writer to bridge the gulf between the user’s goals and the specifications required by the computer is by using higher-level languages, which allow the user to directly define the task in the same language as used by the task domain. Though it requires more effort from the designer to make a task expressible in a high-level language, the effort needed by the user can be greatly decreased by avoiding the need to plan and interpret the results. Higher-level languages not only allow the user to close the gap between the user's interaction with the program and the user's expectations, but also help by giving the user greater flexibility in expressing their needs. Yet another way to reduce the gap is by displaying the output semantic concepts directly. This can be done through direct feedback to the user on the rate of flow, which usually is not provided.

An example where the gulf of evaluation has been effectively bridged can be seen in the move from line-oriented text editors to screen-oriented ones, where the user can continue to keep track of their edits without relying on other methods.

Another example given was the improvement of spreadsheets, which allow users to view changes in values and the updated status of the system.


2) How can heuristic evaluation be more beneficial than usability testing? Describe a mobile application that violates at least 2-3 heuristics. List each heuristic along with a short explanation of the violation.

Usability testing involves the observer evaluating the user's actions in order to determine the problems associated with the interface design. It enables user testing even if the users are not familiar with the UI design. Despite the benefits of usability testing, heuristic evaluation gives the observer the flexibility of evaluating the evaluator's comments rather than individual users. Here the observer does not need to interpret the evaluator's actions as well, but simply record their comments.

Another difference between the two is the observer's interest in answering the evaluator's questions and the extent to which they can do so. In usability testing, by contrast, the observer is more hesitant to answer questions, in order to analyze the problems faced by the user. Also, unlike the evaluator, the user is expected to find out the problem using the system itself. At the same time, the evaluators might need help, especially if they are not domain experts or are not familiar with the interface, so that precious resources are not wasted.

Another benefit of heuristic evaluation is the cost. Using the relevant number of evaluators helps find various kinds of usability issues which could hinder the popularity of the product and prevent users from using the product efficiently. Though having 3-5 evaluators has a cost, the profits from creating better-developed software can save more.

Heuristic evaluation can also be performed on interfaces that exist only as paper prototypes and have not yet been developed. This allows experimenters to use heuristic evaluation in the early stages of the design process, unlike user testing, which requires the users to try the prototype.

A mobile application that violates some of the heuristics is the YouTube iOS app by Google. It includes various design issues, such as the following:

Video playlists are missing useful information, such as the number of videos in the list, or other important information that users need to know. This forces the user to recall information about their past selections rather than recognize it through visible options. This also violates visibility of system status, since the user is unable to track the status or information of their playlists.

Another flaw in the design of the app is that it contains two places where a user can sign in, one shown on the main page and the other in the Guide menu section. Blind users are required to log in from the Sign in button under the Guide menu, as the main sign-in option functions incorrectly with VoiceOver. This violates two principles: the user is limited in their choice of sign-in option, and the app doesn't follow the platform convention of a single sign-in feature. This could also violate the principle of aesthetic and minimalist design, since there is no real need to have two sign-in features.

A third flaw of the design is the difficulty users have with changing their subscription status. Instead of tapping to subscribe and double-tapping to unsubscribe the user . This violates the user control and freedom heuristic.


Hao-Wei Lin - 2/12/2014 17:17:55

1. The gulf of execution is the gap or discrepancy between the commands and mechanisms of a system and the thoughts and goals of the user (his/her conceptual model). To bridge this gap, programmers have to work hard to design the mechanisms in a way that makes sense to the user of the application. The gulf of evaluation is the gap between the visual display and the conceptual model of the app. This gap is closed by doing a lot of testing, using the application and evaluating the outcome of the application.

2. Heuristic evaluation seems more beneficial because it breaks down the potential problems that an application has into various aspects (mechanical, efficiency, aesthetic, user control, error prevention, recoverability, etc.), while usability testing puts a lot of emphasis on troubleshooting: on evaluating how bad the problem is and whether or not it is within the users' ability to overcome. Although that is certainly not a bad approach, and usability testing may do the job of ensuring that the flow of the application works and that the user can use the application without too much trouble, it doesn't consider other ways to improve, ironically, the usability of the application. Usability can be affected by, say, aesthetics. A minimalistic layout is much better than a distracting, visually overwhelming layout, even though they both might have the same functionality, because the goals are clearer to the user. Usability testing doesn't care about improving such aspects, yet heuristic evaluation does (especially when there are a lot of evaluators).

A mobile application that violates some heuristics: NextBus

Heuristics it violates: Error Prevention: whenever there is an error, for instance when the Internet connection is lost, the app does not reconnect; instead, the user has to manually quit the application and restart it to fetch the data. The developer of the application did not implement Internet reconnection during runtime of the app.

Aesthetics and Minimalistic Design: the design is extremely uninteresting. The color of the text is very similar to the background, making it hard for the user to recognize important information such as where the bus stop is located and alternative bus routes.

Visibility of System Status: the application has a poor feedback system (in a sense, NO feedback system). The waiting time on the app simply doesn’t change on its own unless the user does the pulling-down-the-screen-for-reloading action (on iOS), and there is no signifier for that functionality. I used to think that I had to quit and reload the application every time I wanted an updated time schedule. Only after I accidentally discovered this feature did I start pulling down to refresh the page.


Zhiyuan Xu - 2/12/2014 17:21:36

The gulf of execution is referred to as the gap between an individual's goal for an action and the ways of executing said goal. For example, to be able to print out an article from the computer, one must turn on the computer, install the driver needed for the printer, hook up the printer, go on the Internet, and click on the appropriate options to print the page. The gulf of evaluation is referred to as the amount of processing structure required for the user to determine whether the goal has been achieved. To bridge these gulfs, one must reduce the semantic distance. On the designer side, a higher-level and specialized language can make the semantics of the input and output language match that of the user's. On the user side, one can learn the method of thinking required by the system.

Heuristic evaluation can be more beneficial than usability testing in that the method can find more usability problems more cost-effectively. Evaluators can be trained in heuristic issues and evaluate based on those categories. An example of a mobile application which violates usability heuristics for user interface design is Grooveshark. It is a music streaming application that allows users to search for song names, artists, genres etc. and stream the music. It violates the consistency and standards heuristic, as there are numerous items on the menu that seem to serve similar purposes. For example, one can choose to organize music into the collection folder or the favorites folder, except both seem to be implemented in the same way, with different icons. Furthermore, the stations and broadcasts sections of the application seem to be designated for the same purpose. It also violates the error prevention heuristic: the application is very prone to crashing, and oftentimes while the music is streaming, the screen indicates that music is playing but there is no audible sound output.


Sangeetha Alagappan - 2/12/2014 17:21:45

1) The gulfs of execution and evaluation are the distance, introduced by the interface, between a person’s goals and knowledge and the level of description of the system they’re dealing with. The gulfs are unidirectional and need to be bridged to increase the feeling of directness and interaction when a person uses a system. To bridge these gulfs, a better interface that requires less cognitive effort to use needs to be designed, such that commands and mechanisms match the users’ thoughts and goals (which would bridge the gulf of execution that goes from goals to system state) and the output displays a good conceptual model of the system that’s easy to perceive, interpret and evaluate (which would help bridge the gulf of evaluation that goes from system state to goals). Further, the gulfs can be bridged by reducing the semantic and articulatory distance either from the system side (in which the designer can construct higher level languages to match the semantics of the input/output languages to that of the user, as well as making use of analogies and WYSIWYG; examples include spreadsheets and folders, which map to the users’ vocabulary) or the user side (in which the user develops competence by creating new mental structures; in a sense they adapt to the system and move closer to the system to bridge the gulf).

2) Heuristic evaluation can be more beneficial than usability testing, especially in the early stages of development. Heuristic evaluation uses a small set of evaluators to assess the usability and discover flaws in an interface. Since the evaluation is done by a group of experts who have used and assessed a variety of interfaces, their evaluations are usually very helpful in improving the interface. Also, because we use the mean of their severity ratings as well as a compilation of all the flaws they’ve pinpointed, we get a relatively unbiased overall opinion, as individual preferences become outliers and common consensus informs further interface changes. Also, heuristic evaluation is more thorough, addresses more severe issues first, and can often find problems that average users will not find in their cursory interaction with the system. The brainstorming that may follow is also a well-informed, productive discussion, as the evaluators channel their expertise and observation to improve the design.
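The aggregation step described above (compiling every evaluator's flaws and averaging their severity ratings) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `aggregate_findings`, the 0-4 severity scale, the evaluator names, and the sample problems are all hypothetical, and the mean here is taken only over the evaluators who reported a given problem.

```python
from collections import defaultdict
from statistics import mean


def aggregate_findings(reports):
    """Merge per-evaluator reports into one list ranked by mean severity.

    reports maps an evaluator name to a dict of {problem: severity}.
    The 0 (not a problem) to 4 (severe) scale is an assumption.
    """
    ratings = defaultdict(list)
    for findings in reports.values():
        for problem, severity in findings.items():
            ratings[problem].append(severity)

    # One row per problem: (problem, mean severity, number of evaluators).
    summary = [(p, mean(scores), len(scores)) for p, scores in ratings.items()]
    # Most severe first; ties are broken alphabetically for a stable order.
    summary.sort(key=lambda row: (-row[1], row[0]))
    return summary


# Hypothetical reports from three independent evaluators.
reports = {
    "evaluator_a": {"no undo after pinning": 3, "back button inconsistent": 4},
    "evaluator_b": {"back button inconsistent": 3},
    "evaluator_c": {
        "no undo after pinning": 2,
        "back button inconsistent": 4,
        "tiny comment icon": 1,
    },
}

ranked = aggregate_findings(reports)
for problem, avg, count in ranked:
    print(f"{avg:.2f}  ({count} evaluators)  {problem}")
```

Sorting by mean severity puts the problems most evaluators agree are serious at the top, while a low rating from a single evaluator sinks toward the bottom of the list.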

The Flickr app isn’t always the easiest application to use. Here are a couple of heuristics it violates: a) Consistency and standards: It is often very difficult to figure out what certain signifiers will do in the Flickr app. The back button doesn’t, as usual platform conventions would have it, take you back to your last activity; depending on the context, it scrolls you through comments and photostreams. It takes a couple of tries to understand how to navigate between different activities and what certain actions mean. b) User control and freedom: In figuring out what certain actions and icons do, I mistakenly pinned something from Flickr onto my phone’s home screen and have no idea how to immediately undo it or unpin it. Many elements in the Flickr app, like adding comments, also don’t have a quick fix if you mistakenly add a comment or want to opt out of an activity once you’re in. There’s no immediate “emergency exit” or undo.


Max Dougherty - 2/12/2014 17:22:08

A “gulf of execution” refers to the theoretical “distance” between the intention of the user taking the action, and the actual instruction the machine executes. In reference to the direct manipulation interface, which attempts to minimize this distance, the width of a gulf of execution can create a relative measurement of the usability of a program. This translates to the effort exerted by the user to enact a desired action. The “gulf of evaluation” is then the amount of work required to establish if the desired action had the desired effect. With the design of a “direct manipulation” interface, the gulf of execution can be reduced by giving the user the sense that they are personally handling system objects. To bridge the gulf of evaluation requires an increase in feedback. However, feedback alone does not impart understanding, thus feedback must be of a semantic form the user can readily interpret.

Heuristic evaluation differs from usability testing by altering the dialog between the evaluator and the observer. In traditional testing, the observer does not offer help when asked. In heuristic evaluation, by offering help when requested, the observer can understand the difficulty the evaluator is having and can use that information to infer a usability issue. This also allows the observer to understand the thought process of an evaluator, ideally illuminating the root and possible solution to a usability problem. Furthermore, heuristic evaluation does not require the entire system to be tested, but rather allows focus on specific features of the application. This can allow ongoing testing as other portions of the project are still under construction.

Mobile Phone App (Sochi 2014 Olympics App): 1. Consistency and standards: Every menu system is different, and navigating between them is unclear and requires me to navigate to the home page to re-establish my understanding of where I am in the program. 2. Aesthetic and minimalist design: On opening the app you are immediately inundated with over 20 available action buttons, and the clutter causes confusion as to which specific button causes a desired action. 3. Help and documentation: This app provides no documentation of how to operate features or the effect of an action.


Stephanie Ku - 2/12/2014 17:24:46

1) The gulf of execution refers to the gap between the user’s understanding of the tasks and the actual interface language of the machine. The more the user needs extensive planning or translation activities to be able to use the machine as per his/her intentions, the wider the gulf of execution. To bridge this gulf, we can provide the user with a higher-level language that expresses the tasks in a structural form. It is easiest if the task can be described in the same language used within the task domain. In other words, the user interface should make it easy for the user to map their intentions to expressions in the language. Additionally, consistency in design and structure across the user interface will also bridge the gulf of execution.

On the other hand, the gulf of evaluation refers to the amount of processing structure required for the user to determine whether or not the goal in using the machine has been achieved. The more mental workload the user needs to put in to understand the output, the wider the gulf of evaluation. Thus, to bridge this gulf, we need to reduce the user’s mental workload. This can be done by presenting the information to the user in a form that directly fits the terms of evaluation. For instance, if we are evaluating the difference in a number, the user is presented with a ‘Difference’ value rather than one ‘Current’ value that the user needs to track over time.

2) Heuristic evaluation could be more beneficial than usability testing in several ways. Firstly, heuristic evaluation could be better when testing with people who are not as familiar with usability. In traditional user testing, help is rarely given to the user, as the tester wants to see which mistakes the user makes. On the other hand, heuristic evaluation allows the test to be more focused around a few issues. With the heuristics in mind, the test is more structured, allowing the tester to have a more educated and structured understanding of the test. It also makes the evaluation process of the test much simpler, as the findings can be categorized by heuristic.

A mobile application that I thoroughly enjoy but that violates a few heuristics is Plants vs. Zombies (the first one). Firstly, the application violates the aesthetic and minimalist design heuristic. While I understand that they are trying to employ skeuomorphism (with the gravestones etc.), the placements of these buttons or texts are not clear. Take, for example, the home page. There are several ‘buttons’ for which it is unclear whether or not they are even buttons. Additionally, it is unclear which button to press to play the game. The wording, e.g. “Adventure”, “More Ways To Play”, makes it ambiguous as to which button to press to start the game.

In addition, the application violates the recognition rather than recall heuristic. While it is understandable, as there are so many different plants and so many different types of zombies, there is no intuitive way to find out which plants or zombies do what. In order to find out these zombies’ and plants’ abilities, we have to press the Almanac button, then pick Plant or Zombie, then press the picture of the plant/zombie to figure it out. This also violates the flexibility and efficiency of use heuristic. Wouldn’t it be great if I could just click on a zombie or a plant at the beginning to figure out its powers, rather than having to take 3 extra steps to do so?



Daphne Hsu - 2/12/2014 17:26:48

1) The gulf of execution is the gap between a user's goal for action and the steps needed to execute that goal. It is the difference between what the user thinks needs to happen and the actual actions that need to be taken. The gulf of evaluation concerns determining whether the goal has been achieved, and how much work it has taken to achieve the goal. These gulfs can be bridged from the system side, which requires effort from the designer, or from the user side, which requires effort from the user. The designer can work towards helping the user, or the user can try to build competence and understand the designer's goal.

2) Heuristic evaluation can be more beneficial than usability testing when the interface being tested is very complicated. In usability testing, the experimenters don't want to provide more guidance to the testers than necessary, but this may be troublesome because the testers may waste a lot of time figuring out exactly how to use the interface. In heuristic evaluation, testers are given hints on how to proceed with the interface so that evaluation time is not wasted struggling to understand the mechanics of the interface. A mobile application that violates 2 heuristics is Uber, the taxi-like car service. One heuristic it violates is "error prevention". I've tried to order a car from them before, and the request kept timing out and not working after multiple tries. Another heuristic it violates is "help users recognize, diagnose, and recover from errors." When my request timed out, I wasn't presented with any clear error message, so I just assumed it was a timeout error that kept persisting.


Conan Cai - 2/12/2014 17:28:06

The gulf of execution is the process of "how" some action will be carried out. A user has a mental picture of what the result should look like. The user knows "what" he or she wants but doesn't know "how". The machine, on the other hand, is the opposite. It knows "how" to make something, but it doesn't know "what". The user needs to communicate with the machine, and in this process the gulf is bridged. The difficulty of bridging the gap relies on how well a user can communicate. Using human readable instructions as the input to the machine allows the user to more easily verbalize exactly what they want. After all, being able to tell a machine "I want this to be red" is much easier for a person than trying to give the machine the same instructions with a bunch of 1's and 0's. Similarly, the gulf of evaluation concerns showing the user the result. The gulf is the gap between the result of some machine action and how it communicates back to the user. Again, using human readable output bridges the gulf effectively. Machines are much better at translating machine code to human readable text than the other way around. In this case, the user is the "weakest" link, and catering to the user by making things human readable allows effective communication in both execution and evaluation.

Heuristic evaluation provides a basic checklist that virtually all products or applications should meet. This gives a quick standardized level of usability from which finer adjustments can be made through usability testing. Heuristics can catch any major design flaws immediately. What I've noticed in some mobile games is a violation of "User Control and Freedom". All too often I will accidentally hit a tutorial button and need to swipe my way through the tutorial to get back to the main menu rather than having the "emergency escape." Many of these mobile games also violate "Aesthetic". Mobile banner ads can dominate the screen and distract the user. Making these ads intrusive lets the user know exactly why the app was made in the first place: to make money, not for the users' enjoyment.


Diana Lu - 2/12/2014 17:28:15

The gulf of execution is the difference between what the technology allows the user to accomplish with the application and what the user believes he or she is capable of doing. Essentially, the gulf of execution questions whether the application is fulfilling the user's intent to the fullest. The gulf of evaluation refers to the ease of use on the user's part, in which the user must be able to perceive what actions or motions will bring the user closer to what they wish to achieve. One way to bridge the gulf between user and creator is to constantly request feedback from the target demographic without giving the user any prior information as to how to operate the application and what features it includes. Essentially, the designer should leave it entirely to the user to understand the flow of the application. (usability testing)

Heuristic evaluation measures an application's components against a predetermined set of heuristics, whereas usability testing is done with users. Heuristic evaluation can be more useful when the user's needs are not defined as specifically as one might want, or if the application is intended to be used by a broad range of people. It might be more effective to measure the application against pre-set ideas rather than the volatility of actual users. An example of a mobile application that violates some heuristics is IFTTT, which essentially allows you to work around your mobile to create links between activities your phone is capable of. However, it lacks minimalistic design as well as help and documentation. I found while trying to use it that although the application itself was incredibly useful, it was difficult to use without much of an explanation.


Derrick Mar - 2/12/2014 17:28:57

1) In short, the gulf of execution is the gap between what the user wants and the means of the system executing that goal. More visually, it is everything between the user's stated action and the bits in the computer needed to execute it. The gulf of evaluation is the difficulty of aligning a system to represent a current state to the user that can be easily interpreted (through the interface). One way to bridge these gulfs, for example in higher level languages, is to make the physical form of the vocabulary similar to its actual meaning in the real world. Consequently, the user will have an easier time associating the meaning of the commands with what they actually do to the system. Another way to bridge these gulfs is to “bring the user closer to the system” by having that individual learn and adapt to it.

2) I think the major advantage of heuristic evaluation over (traditional) usability testing is the ability to gain more concrete data. Moreover, in heuristic evaluation the evaluator not only identifies the usability problem, but must also concretely explain why he/she finds it a problem. In traditional user testing, it is up to the “observer” (the one observing the user test the system) to interpret what problems the user is having.

In my opinion, one poorly designed app is the Android Etrade app. The first heuristic it violates is “help and documentation.” For such a complex app as Etrade, it only has a small FAQ of 10 questions. Moreover, there is no way to search through this FAQ other than scrolling the page. Essentially, it feels like you are scrolling through a bunch of text. Another heuristic that is violated is the “flexibility and efficiency of use” heuristic. If you click the menu button on the top left, you are overwhelmed with about 20 tabs, many of which won't make sense to a new user even if that user has financial expertise (e.g. screened, barcode lookup). This leads me to the third heuristic violated, which is “aesthetic and minimalist design”. Granted that it is a financial application, the screen is totally filled with text and numbers, many of which are unnecessary for the user. To give a clearer picture, there is only about 5% “white-space” on average in the app. Thus, the Etrade app violates several heuristics, which can make it particularly cumbersome for novice users, especially if they intend to use the app for only a few tasks.


Qianyun Li - 2/12/2014 17:29:17

1) The gulf of execution is the gap between the user's intention and how to instruct the machine to do what they want. The gulf of evaluation is the gap between the machine's output and the user's goal of computation. Users might sometimes need extra manipulations in order to get the output.

There are two ways to bridge these gulfs. One is from the designer side, where they design higher level languages that move towards the user. Second is from the user side, where they try to adapt to the system by changing the way they think and the language they use to better adjust to the system.

2) Heuristic evaluation can be more beneficial because there are multiple evaluators, so it's more comprehensive in terms of finding usability problems. Furthermore, since each evaluator conducts the evaluation separately, the feedback is unbiased and independent.

A mobile application I found confusing is "Gallery" and "Photo", two native apps on Samsung Android phones. They violate the heuristic "Recognition rather than recall" as well as "Consistency and standards". First, both apps are very similar but somewhat different, so I am always confused about which one to use under which condition. It seems to me they are very inconsistent in the language describing what each function does.


Cheng Sima - 2/12/2014 17:31:04

1) Gulfs of execution are the distance between the user's goals/thoughts and the system's commands and mechanisms that have to be used to reach that goal. Gulfs of evaluation are the distance between the user's perceived feedback/output from the system and the system's actual output.

There are two main ways to bridge the gulfs: let the system use higher level language to communicate with the user, or train the user so that he/she will begin to think in the way of the system language.

Using higher level language would involve making the commands as close to normal language commands as possible. The output should also be easily interpreted, for example, as in real time word or photo editors. We can also provide tutorials to train new users, for example, teaching computer science students how to use machine language in 61C.

2) Heuristic evaluation can be more beneficial than usability testing because evaluators have a list of heuristic guidelines to compare with, and we can pool different evaluators' comments to test the design comprehensively, yet focused. In addition, heuristic evaluation can be used to test early-stage prototypes that may not have been actually implemented, since there is a set of heuristics to compare with and the evaluators are not really "using" the system.

An example of an application that violates several heuristics is the NextBus on my phone. Firstly, it violates "flexibility and efficiency of use". My NextBus can never detect my location. However, this is not the problem. The major problem is that after more than 100 times of always choosing AC transit and either 51b or 1, every time I open the app I have to start from "Menu", "Select Specific Stop", "AC Transit" to choosing the route. There is no way for me to speed up this repetitive action. I would recommend the app to be able to remember my most recent inquiry.

Secondly, its design violates "visibility of system status". There are many times when I try to go to another page and I get no response at all. There is no feedback on my action. Most of the time I suspect it is loading because the internet is slow, but I don't receive any status feedback or any icon that shows it is loading. Sometimes I even wonder if the app is frozen, or if my phone is dead. I would recommend that the app show a loading icon when I click to go to another page, so that I won't be confused as to whether I clicked on the right place or if my phone is frozen.


Aman Sufi - 2/12/2014 17:34:14

1. The gulf of execution would be aptly described as the amount of effort a user must put in in order to go from their intention to its execution on a machine. Often the more simple the machine, such as a computer which only accepts binary code input, the more involved a user must be in transcribing their intentions into being executed on the machine.

The gulf of evaluation is the converse to the gulf of execution and refers to the amount of processing structure a user needs in order to determine if the execution of their intention yielded the goal they intended. For example, a camera firmware that only shows the last image in a series of burst images but does not tell the number of images captured or the frame rate until the set is expanded would have a large gulf of evaluation if the user’s original intention was to take a series of 10 burst shots in one second.

To bridge these gulfs, there are multiple approaches, but they often have tradeoffs. For example, creating higher level languages often makes it much easier for a user to solve common problems within a certain domain without having to implement the low-level features, but at the same time this means that high level languages lose their generality and reduce the freedom of the programmer in expressing their intentions. WYSIWYG programs are another way of bridging the gulf: they make it very apparent to the user how to input what they are thinking of, and they instantly display the results of their actions. It is also a good idea to focus on the most needed uses of a program and specialize the program so it is easy and achievable to meet those needs.

2. One example of this is the camera app. I start with the intention to open a picture and with the app open I want to focus on an area on the screen. I tap the area, bridging the gulf of execution, and the phone displays a circle under my finger that changes in size to give the perception that it is focusing, which takes me from the execution of my task of focusing the image through the gulf of evaluation, where I am able to evaluate that the camera is trying to focus due to the circle expanding and contracting as the image moves in and out of focus. Once it successfully focuses, the circle fades away and my original goal is complete.

As a second cycle, I now want to take a picture as my goal, so I push the button and the camera snaps a picture showing me a shutter close graphic on the screen to signify to me that it has succeeded and then showing the pic going into a small stack of pictures in the corner of the screen. Knowing this, I can go on to the third goal of seeing the picture by tapping on the stack of images and my goal is accomplished when the stack expands to show me the image I just took.


Chirag Mahapatra - 2/12/2014 17:34:33

Gulf of execution: It is described as the gap between a user's goal for an action and the means to execute the goal. This can be shown with the example of recording a television show. Here the user might wish to press a button to record a show. However, in olden day systems like VCR, the user will have to specify time, channel, save the settings and then press "OK" before actual recording begins.

Gulf of evaluation: It can be described as the degree to which system provided results match with the user expected results. This can be shown with the example in the text where the user is attempting to control the rate at which the water level changes in a tank. While the user might want to receive a report on the rate of change of water flow, the user might just receive the water level in the tank. The user will need to measure the water level over time and control the tank.
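As a rough illustration of the extra work the tank example places on the user, the sketch below derives the rate of change from a series of water-level readings, which is the computation a system with a narrower gulf of evaluation would do on the user's behalf. The function name `rate_of_change`, the units, and the sample readings are all hypothetical.

```python
def rate_of_change(samples):
    """Return (time, rate) pairs from consecutive (time_s, level_cm) samples.

    Each rate is the change in level divided by the elapsed time between
    two neighbouring readings, in cm per second.
    """
    rates = []
    for (t0, level0), (t1, level1) in zip(samples, samples[1:]):
        rates.append((t1, (level1 - level0) / (t1 - t0)))
    return rates


# Hypothetical readings: the level rises, slows down, then holds steady.
samples = [(0, 50.0), (10, 52.0), (20, 53.0), (30, 53.0)]
rates = rate_of_change(samples)
for time_s, rate in rates:
    print(f"t={time_s}s  rate={rate:+.2f} cm/s")
```

Displaying the rate directly, instead of only the current level, presents the output in the terms the user is actually evaluating.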

To bridge these gulfs, one will have to measure the semantic and articulatory distances in the user interface and try to minimize them.

In heuristic evaluation, the observer helps the evaluator in case of problems and assists them in operating the interface. In usability testing, by contrast, the observer does not help the user more than necessary. Instead, the observer has to interpret the actions of the user and make the report. Heuristic evaluation is also a quicker process than usability testing, because the observer just has to aggregate the evaluators' own comments into one document. Heuristic evaluation can also be done on a piece of paper, which makes it extremely advantageous for early stage testing. Usability testing, by comparison, is done only in advanced stages of product development.

Google Drive mobile app: Error prevention: The Drive app stops uploading pictures when the user leaves the app. Also the user is not informed when the upload is not successful due to leaving the app. Hence, no error recovery.

Yahoo stocks application: Aesthetic and minimalistic design: The application just has too much information concentrated on a small screen. Unless the user is very cognizant of the current state of the market, it is very difficult to use. This is also true for most financial applications.



Insuk Lee - 2/12/2014 17:47:42

1. The gulf of execution is the gap between a person's goals and the provided interface of the system to accomplish this. The gulf of evaluation, similarly, is the gap between a person's observation and the provided data/output of the system. The gulf of execution is bridged by making the commands and mechanisms of the system match the thoughts and goals of the user. And the gulf of evaluation is bridged by making the output displays present a good conceptual model of the system that is readily perceived, interpreted, and evaluated. The goal in both cases is to minimize cognitive effort. One way to bridge these gulfs is to make a high-level or specialized language to make the semantics of the input and output language match those of the user. This requires more work on the part of the system designer. Another way is for the user to build new mental structures and think in the same way required by the system. This would require more cognitive effort on the user's part.

2. Heuristic evaluation can be more beneficial than usability testing because of the interaction with the evaluators. In a technical interview, it's better to say what you're thinking out loud so that the interviewers know where you are and understand the way you think, rather than having them watch you struggle through a problem. In the same way, heuristic evaluation answers a lot more questions about why users do what they do and what they struggle with. Snapchat is an app where I ran into a couple of heuristic violations. First, it violates "recognition rather than recall" because it uses icons and buttons that are not standard in the UI industry, making them hard for users to get used to and leading users to make errors. Second, it violates "user control and freedom" because there is not much a user can do to move between screens. For example, I had a hard time adding a friend because it was not clear how to go back from one page to the page where you add friends.


Romi Phadte - 2/12/2014 17:55:51

The gulf of execution is the disparity between a user's goal for action and the means to execute that goal. The gulf of evaluation is the gap between the representations that a system provides and the user's perception and expectations of those representations. One way to bridge the gulf of execution is to provide a clear UI with intuitive signifiers and appropriate images on buttons, so that it is clear how to execute goals, and to make sure the mapping is clear. We can bridge the gulf of evaluation by understanding that flat designs can obscure meaning and by making sure all icons are clear and can be perceived as the correct representation.

A mobile app that breaks a couple of heuristics is Lyft. One heuristic it breaks is "Visibility of system status". When I got a coupon code that paid for 25 dollars of every Lyft ride, it wasn't clear whether the coupon was going to be applied to that specific ride or what the total cost to me was after the coupon was applied. "Help and documentation" is another heuristic it breaks. It gives no details regarding how tipping works. The app gives a number for customer support, but no one picked up my calls. Eventually, I had to tweet to the Lyft Twitter account, and I got the information that I needed. The last heuristic that it violated was "Error prevention". There was one time when I asked for a cab through the app. I later received a call from a Lyft driver saying that my location wasn't correctly displayed by the app. This wasn't communicated effectively to the Lyft user, and the driver had to use other means to recover from this error. This isn't ideal. If the app had alerted me that it had difficulty recording my location, then I could have been proactive about the problem.


Eric Hong - 2/12/2014 18:20:00

The gulf of execution is the gap between the user's intentions and the actual machine operations needed to accomplish the desired task. The gulf of evaluation refers to the amount of mental processing required for the user to determine whether a goal has been achieved. These gulfs can be shortened from either the system side or the user side. The system designer can create user interfaces that more closely match the user's intentions and make it easier for the user to evaluate results. On the other hand, the user can form a new mental model to more closely link the desired goals with the capabilities of the device. Heuristic evaluation can be beneficial in providing a set guideline for finding the main usability problems, since it provides 10 concrete heuristics to look for instead of merely observing the user interactions in usability testing. The following is an example of a mobile application that violates the user control and freedom, consistency and standards, and minimalist design heuristics. Imagine a paint application for drawing an image with virtual brushes that have controllable sizes and colors. The application does not have undo and redo buttons to easily fix mistakes, which violates user control. The same button that controls brush size also controls brush color depending on previous user inputs, which violates the consistency heuristic. Finally, information unnecessary to performing the actual task, such as the number of pixels that match the current brush color, is prominently displayed, which violates the minimalist design heuristic.


Christopher Schechter - 2/12/2014 21:31:50

1) The gulfs of execution and evaluation relate to how the user translates real-world analogues of actions and information to what they see and do in an app. The gulf of execution is the difference between how a user would go about performing an action in real life and how they would perform the same action in the app in question. The more things a user has to do to perform an action in the app compared to the physical world, the wider the gulf of execution is. The gulf of evaluation is the difference between how an app presents information to the user and how the information is perceived in the real world.

Both gulfs can be bridged, for example, by making the interface better reflect the physical world. For example, digitally painting in Photoshop using a mouse is very difficult, but when using a pen and tablet similar to real-life drawing tools, it is much easier: this helps cover the gulf of execution. An example of bridging the gulf of evaluation is Microsoft Word--the user can view their document as a piece of paper with the text appearing as it will when printed, which helps them visualize and plan their document accordingly.

2) Heuristic evaluation can be more beneficial than usability testing in that it allows the designers to closely examine the pitfalls of specific pieces of their product. Designers can nail down the targeted areas that fail in order to improve them. Additionally, while testers might ordinarily find only surface problems with the app, the further dialogue with the designers that is required will enable the testers to look at aspects of the app that they may not have evaluated otherwise.

The Facebook iOS app violates a couple of heuristics. For one, it lacks some flexibility--you can't message people directly through the app, but rather you need to use the dedicated messaging app. Decoupling the messaging functionality seems like the wrong decision to me, since messaging is a large part of the Facebook experience for many people, and requiring two separate apps makes mobile Facebooking a hassle. It also lacks minimalist design--although the more recent iterations of the app have been better about this, I feel as though the app is still very cluttered. The initial view with the newsfeed visible and a few options across the bottom bar is OK, but as soon as you go into any menu or hit the "More" options button on the bottom, the app overloads you with options. It's almost as if there's too much functionality, and it becomes overwhelming, which is why I'd much rather use the Facebook website on a computer where there is more screen space to support it all.