This lesson is in the early stages of development (Alpha version)

Code-Review

Introduction

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • How can collaborative construction benefit a software project?

  • How can collaborative construction benefit developers?

  • What are some of the different types of code review?

Objectives
  • Learn how to classify code review along different axes.

  • Remember developers are people too!

Collaborative Construction

Who

Who is performing the review and whose work is being reviewed determines the point of a review. If a senior member is reviewing a pull request from an intern, they may want to provide feedback on style or help the intern understand the larger software system. Conversely, an intern may share a new language feature they just learned in college that the senior developer hadn’t known. Maybe a stakeholder is reviewing user stories to ensure the deliverable will satisfy their needs. The desired audience will affect how a review is performed, what kind of common language is available to a senior developer that a stakeholder may not understand?

What

What is being reviewed can also change what the collaborative construction is called. When reviewing single lines of code as they are written, this is usually called pair programming. Reviewing single PRs or commits is classified as code review. You can also review design documents in project planning. Checking test coverage may be called quality assurance and a final check of a repository prior to release is a repository review.

When

When a review takes place is tightly coupled to what is being reviewed, but not entirely. Consider discussing a change to the underlying data structure in a program. Such a recommendation could be easily considered during design but may be impractical prior to release (especially if the code isn’t modular). Keep in mind where in the project life cycle a review takes place.

Why

Why a review happens can be for a variety of reasons. Maybe they are a rubber stamp you have to do before anything is merged. Maybe you really don’t know the best algorithm or library for an operation and want feedback from a colleague. Or perhaps you are a new contributor to a project and want to learn more about the fundamental systems in place. Usually code reviews take place to ensure changes to code are necessary and correct. An hour spent by a separate developer can save days or weeks of time tracking down a new bug or failing test. This is why large companies require reviews, but reviews can be beneficial to smaller teams that also contain novice developers. Some other benefits for reviewers can include:

Pair Programming and Code Review

With such a vast range of possible topics, we will focus here on Pair Programming and Code Review at the level of a PR. Pair programming is working together, simultaneously with one person typing and the other providing guidance and feedback. Code Reviews of PRs are usually asynchronous and performed to ensure incoming code has a specific purpose and meets quality standards.

Mechanics and Methods

Code review requires two complementary skills: the mechanical actions of performing the review through github (or another version control system) and the soft skills of providing feedback.

It can be helpful to draw a comparison between code review and editing a scientific paper. The mechanics of editing a paper may require learning how to record changes, add comments or adjust references. What you say in those comments and what changes you suggest are part of the soft skills. Perhaps even more important is how you say it.

With software, we sometimes lose sight that there is no right or best answer. For any problem that is non-trivial, multiple solutions are viable and the choice between them requires balancing different trade offs often based on personal experience. There are many build systems for python. You may have a preference based on good or bad experiences you had in a prior project, but that doesn’t make it the correct decision. Again relating to editing a manuscript, you (probably) won’t say something like “you picked the wrong colors in this graph” or “you used this phrase incorrectly”. Even, in cases where factual errors are present, it’s more polite to focus on the writing instead of the writer. Recommendations provided with respect are more likely to be followed or at least considered.

What you decide to bring up in a review is personal, reflecting your background knowledge, project goals, and even your mood during the review. We can’t cover all possible things to review, just think of it as an exercise in copy-editing. You have some knowledge that will help the code get better and you need to share that in a way that will be received positively. We will have some details and guidance later, but try to approach reviews like you would prose, not a mechanical drawing or mathematical proof.

Developers are People Too

While you are writing code, you are ultimately producing instructions for the machine. You can be terse and uncaring because machines don’t have feelings. While you are writing reviews, you are providing feedback to a person. Remember they don’t know everything you do (and vice versa) and that doesn’t reflect on their value as a person, their intelligence or even their capabilities. Maybe a PR was made to meet a deadline and there wasn’t time to go back and refactor. Perhaps you have a decade of experience with python while they picked it up a year ago and are more comfortable with compiled languages. Maybe they are also handling a sick family member at home and are having trouble focusing. So before starting a review, remember to change your audience from computer to people.

Practically, follow these suggestions:

Summary

Collaborative Construction is a heterogeneous topic that is important throughout project lifetimes and has many different goals. The mechanics of reviews are separate from what to review. Think of code reviews like editing a manuscript to get into the right frame of mind and always remember the other developer is a person. Your relationship may last several years or just this PR, but they deserve respect and recognition.

In this module, we will cover pair programming and code review at the PR level. While pair programming is less often utilized, it allows you to experience providing guidance and feedback to a fellow developer without getting caught up with the mechanics of performing a review.

Key Points

  • Reviews benefit a software project by fostering a sense of community, improving overall quality, and spreading knowledge of the codebase between developers.

  • Reviews benefit developers by allowing them to learn new design patterns or language features, distilling knowledge throughout a team, and providing some socialization between members.

  • Code reviews can range from immediate pair programming to post hoc repository reviews and everywhere in between. When developers say code review, they most often mean at the level of an individual pull request.


Pair Programming

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What are the roles in pair programming?

  • When is pair programming most helpful?

Objectives
  • Learn about the benefites of pair programming

  • Learn the roles of each member and some hints for successful pairing

Introduction to Pair Programming

Pair programming (a.k.a. pairing) is an increasingly popular technique used in software development. It addresses the “high cost, low payback” feeling of being a code reviewer by having two developers work together actively, to write a feature and its tests.

When pair programming, there are two roles:

Periodically, at reasonable points, the Driver and Navigator switch roles.

It is true that pair programming takes more time to write a given programming. However, “two people doing the work of one person” is a completely wrong assessment. Several studies have shown only a ~15% increase to development time with pairing, after participants became comfortable with the approach. But, if pairing saves sufficient costs in other parts of the development lifecycle (e.g. debugging, maintenance) then it is worth it.

As with code reviews, there are several significant benefits of pair programming that far outweigh this increase in development time, which we will talk about next.

Pair Programming and Defects

Pair programming does an excellent job of improving code quality, with one study showing that pair-programming produced code with 15% fewer defects than solo-programming.

This is typically because:

What is Rubber-duck debugging?

Originating from a story in the 1999 book, The Pragmatic Programmer, “rubber-ducking” is the act of describing code line-by-line to something, sentient or otherwise (a rubber duck in the original story), while debugging code. While it may sound ridiculous, this phenomena has been found to increase the detection of bugs or unintended behavior. While pair-programming, the driver will explain the code line by line while substituting the rubber-duck for the navigator, experiencing a similar effect.

Assuming that pair programming is generally this effective, the time-savings in future debugging and maintenance costs more than compensates for the slight increase in development costs.

Remember, the longer a defect is undetected, the more costly it is to fix. Pair-programming tends to detect and resolve defects at the earliest possible moment, before code is even included in a commit to a repository.

Improved Satisfaction and Team Dynamics

Pair programming has been found to dramatically increase individual satisfaction. Collaborating with someone on a shared task is far more enjoyable for most people, and allows for the pair to celebrate success, mourn setbacks, and generally build camaraderie. One study consistently showed >90% of participants preferred pair programming over solo programming (Williams 2000).

Another added benefit of pair-programming is that team members learn to work with each other more effectively. By giving consistent practice in this area, pair programming forces people to work through personality differences and different working styles.

Pair programming also supports the distribution of knowledge and skills more broadly throughout the team. By pairing junior and senior developers together, opportunities for mentorship are created. Including teammates who are unfamiliar with a part of the code will teach them how it works, and broaden the shared knowledge throughout the team.

Improved Productivity

Teams that practice pair-programming frequently report increased productivity. Switching regularly between the Driver and Navigator roles keeps developers fresher and more engaged throughout the day.

For example, if one developer is distracted by some external interruption, the other person can get them back up to speed on the task more quickly.

Even if you don’t utilize pair programming daily, it can be a nice break from solo programming and even a little fun! Pair programming is especially useful in events like hackathons, where there may be more people than action items to do. Everyone will contribute to the project without stepping on toes and keeps the team fresh and motivated by swapping roles.

Issues with Pair Programming

Pair programming requires active focus, communication and participation from both programmers. Long periods of silence is a warning sign of something wrong.

Pair programming simply won’t work well if:

Other practices to avoid (in a larger team setting):

Also remember, that pair-programming is not a fix-all solution. Not every part of a program warrants pairing. For example, two experts working on simple, well-understood tasks will likely work faster individually than if they are pairing. Pairing achieves the greatest benefits when:

Senior developers should also pair with junior developers in order to provide mentoring. It is important for the senior developer to be patient and work at the junior developer’s pace and to solicit feedback from the junior developer on their code.
Remember everybody makes mistakes!

Guidance for Drivers

  • Spend time up-front discussing possible design approaches with Navigator, assuming you have not already discussed various design approaches with your team.
  • Make sure you are “thinking out loud” while programming.
    • Describe, at a reasonably coarse level of detail, how you are solving the problem you are trying to solve.
    • After implementing anything significant (e.g. a function, a relatively complex loop or chunk of code), ask for input!
      • You can ask things like “Does this look right?” or “Am I missing anything?”
  • After writing code, always spend time brainstorming how to test it
    • Even if tests are not written while pairing, make notes about how to test
    • Recall: Test code should be as important as product code, and should be designed and maintained just as carefully
  • “Give up the steering wheel” at suitable points!

Guidance for Navigators

  • If you see a possible issue, mention it!
    • Don’t quietly assume that the Driver is aware of an issue you might see
    • They may be focused on some other part of the problem
  • Consider whether the code’s control-flow handles all possible scenarios correctly
  • Consider whether the code fully satisfies the requirements – and also whether you and the Driver fully understand the requirements
  • Feel free to suggest structural/style improvements, as well as possible bugs
    • If part of the code is complex enough to warrant a comment, point it out
    • Don’t be too nit-picky about unimportant personal variations in coding style
  • If any part of the code makes assumptions about inputs or internal state:
    • Make sure they are documented!
    • Make sure they are enforced, through assertions, sanity-checks, etc.
  • Give Drivers a chance to fix typos before immediately calling them out. Typing with an audience can be difficult at first!

Tools for Remote (or local) Pair Programming

  • VS Code
    • Live Share extension from Microsoft
  • PyCharm (and other JetBrains IDEs)
    • Code With Me plug-in
  • Generic
    • Zoom (or equivalent) shared desktop. Use the annotate feature to highlight areas
  • Have line numbers enabled to assist with dialog

Prepare for your journey

The names Driver and Navigator invoke the image of a road trip with two people. This visual can be helpful in remembering individual duties.

The Driver is in control of the car. They can make a decision to stop if they see a gas station or decide what kind of music to play, but they shouldn’t be looking at maps or changing radio stations! They need to stay focused on the car and traffic.

The Navigator, primarily, assists the Driver. They can make longer term plans, focus on details the Driver doesn’t have the bandwidth to consider, and can tell the Driver if they didn’t notice a speed limit change. As the Driver gets tired the two people swap jobs so they can keep moving

You may have experienced car trips where the roles were less defined and the Navigator was back seat driving. Yelling at the Driver that they need to slow down because there is a police car up ahead can make everyone stressed. As the Driver, you can’t just ignore the Navigator and go your own way. You need to work together to quickly, and safely arrive at your destination.

Key Points

  • Pairs consist of a Driver, who does the typing, and Navigator, who directs the Driver and provides assistance when needed.

  • Pair programming is most beneficial if the task is complex or programmers are unfamiliar with the code.


Pair Programming Activity

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What are some benefits and challenges with pair programming?

Objectives
  • Experience pair programming

Setup

Task

You have a complaint from users: The inkjet printer leaves puddles of ink in the dark areas!

Your task is to:

You should now choose who will Drive, who will Navigate, and begin working. About half-way through, you can swap roles.

Remember to communicate! What possible approaches could you take?

This process is about the pair programming process, not the code! If you have extra time you can polish and refactor other parts of the code.

Key Points

  • Pair programming is a skill and practice will improve your experience as a driver and navigator.


Code Reviews

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What are some benefits of formalized code reviews?

  • What is the difference between a code read-through and walk-through?

  • What is your biggest hurdle to starting code reviews?

Objectives
  • Learn why code reviews are important.

  • Identify and address challenges when beginning code review.

Code Reviews

Code reviews, or peer (code) reviews are a practice where someone other than the code’s author reads and reviews the code. The author is often involved in the review, and answers any questions that the reviewer has.

These are sometimes formal inspections of a portion of the code and may be part of an acceptance process that the project requires (e.g. for safety-critical software). Code reviews tend to be a more heavy-weight process with specific criteria to evaluate.

More commonly, code-reviews are change-based code reviews, e.g. “Every new feature or bug-fix must be reviewed before it is incorporated into the main-line of development.”

This is the process that is adopted by most large software companies, and tends to be lighter-weight, akin to a “code walk-through” or “read-through.”

Benefits of Code Reviews

Benefits of formalized code reviews (inspections) are very well documented.

From Code Complete, 2ed. pp.480-481

While less-formal code walk-throughs/read-throughs vary much more widely in their benefits, they can still achieve many of the knowledge-sharing, team-building and defect-detection benefits, but not as reliably as formal reviews.

The main differences are:

Taking care not to lose the rigor of the review process is essential to achieving all the benefits to be had. As usual with life, you get out of it what you put into it.

Challenges of Code Reviews

From discussions with other RSEs, there is a realization that code reviews are beneficial but difficult to implement. Maybe you are sold and want to start adding code reviews to your PRs. You are able and willing to review, but you need someone else to review your code! If you work in a small (perhaps solo) development team, who will review your code and how can you help them get started?

I work alone or with researchers/novice developers

There is a certain asymmetry with code reviews when you are the expert and decide to add in code reviews. You can review PRs from others but not your own work! It seems intractable to tell someone else, hey you have to look at my code and catch all the bugs.

However, you don’t need someone else to review your code, you can perform self reviews. Similar to copy editing, you (hopefully) look over your own writing before submitting it for grading or your PI. Self reviews are usually not better than peer reviews, but are vastly better than nothing. For best results, let the PR sit for a day or two before performing your review. That will give you fresh eyes.

Also, don’t underestimate novice developers or even non coders to review code. If you have a stakeholder that’s willing to click a few buttons on github, they may catch logical or domain specific errors. You should strive to have your code readable enough that a novice can follow the algorithm!

I struggle with all the tools!

Remember the separation of mechanics and methods. The method of code review is the act of critically evaluating a section of code and providing feedback in a courteous way. We have some practice doing this with pair programming. Here we will dig into the mechanics: what are you reviewing and where do you write your comments. The practice of evaluating code is much more difficult than knowing which buttons to press on github!

My project requires a lot of domain knowledge, I don’t think anyone else can give me feedback

Even if someone doesn’t understand every line of your code, they can still provide a different view of the code. Maybe your variable names are unintelligible to someone without your domain knowledge and they don’t have to be. Maybe you forgot to include a license or your functions are too long or complex. None of these require deep knowledge of your domain and can be helpful.

Also note that if your code can’t be understood by anyone but you, it may require some refactoring to improve readability. Open source software has to be approachable to would-be contributors. A non-expert reviewer can help identify places a general developer will have trouble following.

Key Points

  • Code review saves time, money and reduces error. The more formal the review, the bigger the savings.

  • During a code walk-through, the author is present to discuss the changes. A read-through does not require the author during the review.


Best Practices

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • How fast should you review code?

  • How do you decide what should be reviewed?

Objectives
  • Learn about best practices in code review

  • Understand the importance of static analysis tools

Best Practices for Code Reviews

In this episode, we will learn about the best practices of code-reviews regarding scope, goals, feedback, and tools.

Scope

It can be tempting to set lofty goals for code reviews, but in reality, people just can’t concentrate too deeply or for too long!

Widely-recommended practices are to:

Reviewing more code than recommended here can lead to fatigue and the temptation to just say “looks good to me”. Code review requires concentrated, thoughtful feedback.

Goals

Reviewers should have a clear sense of what feedback is being solicited. Some companies have general guidelines for what code-reviewers should be looking for in the code they review

If you are seeking specific kinds of feedback, communicate that to your reviewers! In the absence of this, projects should include general code-review goals and guidelines for reviewers to reference

If there is other background knowledge necessary for an effective code-review, provide it to the reviewers. This can be included or referenced in comments, provided in documentation, etc.

A Checklist from Google

Google’s engineering practices recommend code reviews should look at:

  • Design: Is the code well-designed and appropriate for your system?
  • Functionality: Does the code behave as the author likely intended? Is the way the code behaves good for its users?
  • Tests: Does the code have correct and well-designed automated tests?
  • Naming: Did developer choose clear names for variables, classes, methods, etc.?
  • Comments: Are the comments clear and useful?
  • Style: Does the code follow our style guides?
  • Documentation: Did the developer also update relevant documentation?

Feedback

The highest priority should be placed on providing courteous and respectful feedback on another person’s work. This is a critical skill to develop, and should never be compromised, even in jest.

When providing feedback:

Programmers are often highly opinionated. Creating software feels personal; you are problem solving and translating that in the best way you can. Receiving criticism can seem like someone challenging how you think! Remember, that there are many good ways to solve any given problem – including ways that you personally wouldn’t use: different ways of naming variables, writing loops, structuring functions, etc. When providing feedback, indicate whether your feedback is about a serious issue (e.g. correctness, security, safety), versus merely you expressing an opinion or suggestion.

A Solution for Nitpicks

Several companies specify that review feedback should be prefixed with “Nit: …” if you are merely being nitpicky.

Reviewer opinions and nitpicks are free to be ignored by the author!

Tools

Some goals of code reviews are to check coding style, and to identify bugs, anti-patterns and language abuses. The good news is that there are static analysis tools to help supplement the code-review process and remove some of the more laborious tasks such as:

These tools eliminate a whole range of issues that would otherwise need to be addressed in the code review. Where possible, codify your project style and set up CI/CD or precommit to automatically check or correct any issues prior to code review. These are similar to using a built-in spell checker, you don’t want to waste time looking up spelling in a dictionary! Static analysis tools allow the review to focus on higher-level concerns - how the code works, how it benefits the project, higher-level structural/design issues, etc.

Examples of Static Analysis Tools for Python

  • Flake8 - coding style and linting
  • Pylint - coding style and linting
  • Black - coding style enforcement
  • mypy - check Python code against type annotations
  • ruff - a fast linter and formatter

Additional Resources

These three resources are incredibly valuable and should be read by everyone:

These are also excellent articles:

Key Points

  • Review no faster than 500 lines per hour

  • Decide on a checklist for a project and use it during a review


Code Review Activity

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What challenges and opportunities did you experience during the exercise?

Objectives
  • Learn how to review code in github

Code Review in Github

You don’t have to perform code reviews in github, but they are nicely integrated and widely used so it is helpful to practice there. Ideally, your project would develop in the following cycle:

  1. A user requests a feature or notices a bug. They open an issue. Each issue should contain a single problem or feature and be self contained.
  2. A developer works on the issue, implementing the feature or fixing the bug. The submit a pull request consisting of (possibly several) small commits. Commits should be atomic, e.g. can be selected and applied in any order.
  3. A maintainer or other developer reviews the pull request. They should ensure the PR addresses the issue (and only the issue) while maintaining a high quality of code in the project.

Obviously, life would be easy if this were always true. If you are the only developer you would have to perform all these operations. As you implement a feature, maybe you decide to refactor a method at the same time. When you are rushed, you may be tempted to merge a PR immediately, foregoing the review. Try to resist these urges, especially as you are forming your code review habit.

You can also review code outside of github. Maybe a coworker calls you over and you look at a function implementation on their screen and give some quick feedback. Or you edit an answer on stack overflow to improve it. Maybe a collaborator emails a notebook and you recommend some functions to reduce duplication in their cells. These are all valuable and good practice of the method of critically reviewing other people’s code. However, here we will focus on the mechanics of providing feedback on a PR in github.

We will be using the excellent practice materials from code-review.org.

  1. Setup the tutorial on github. Create the text and python exercises.
  2. Look over the issue for “text: Exercise 1” and the corresponding recipe.
  3. Open the PR for “Exercise 1 text.”
    • What do you think of their commits? The recommended fixes? Anything to add or change?
  4. Click on the “files changed” tab and start a review
  5. When you have finished your review, swap reviews with someone else and review their reviews! Focus on the content, tone and helpfulness. Discuss what you liked or they could have done better.
  6. Repeat the above with python Exercise 1.
    • Now you can focus on code quality a bit more. Use what you’ve learned here and in other modules to improve the PR.
    • Before you get carried away with recommendations, what are the ideal next steps for recommending larger changes?
  7. (Optional) if you have time, review another PR from the tutorial repo, a PR from another participant, or a PR in a project you like.

Key Points

  • Strive to review PRs that solve a single issue.


Collaborative Construction Debrief

Overview

Teaching: 0 min
Exercises: 0 min
Questions
Objectives
  • Review the exercises and discuss any challenges or opportunities you noticed

Code Review

Code review adds a social aspect to the typically solitary work of development. Giving feedback to people can be challenging after working on software for a while, remember to change your audience and mentality. Receiving feedback can also be difficult; reviews can feel like they are attacking how your think instead of how you code.

Still, code review greatly improves projects, saving time and money in substantial amounts. When starting, remember that self reviews are useful practice and can improve PRs prior to merging. Talking and sharing your expertise can be fun!

Key Points