
Tests

Tests are extremely important to the health and stability of Streetmix. We have established some systems and processes to help ensure the ongoing reliability of our platform.

We do not have a strict test-driven development (TDD) methodology, although individual engineers may use this approach if that’s the development pattern they are most comfortable with. Also, while we do measure code coverage, our goal is not necessarily to reach 100%. We’re looking for “enough” coverage to have confidence that new features or refactoring will not create new bugs, which can be more of a subjective approach. As Guillermo Rauch says, “Write tests. Not too many. Mostly integration.”

For context...

We did not have any test infrastructure in the early phases of Streetmix. Tests have been added over time and are constantly improving. This document reflects our current thoughts about how we should test, but you’ll find lots of moments in the codebase where tests are incomplete or non-existent. We could always use some help with writing tests!

Running tests locally

When testing in a local development environment, only linting and unit tests are run.

npm test

Full integration tests happen in our continuous integration infrastructure. You’re not required to run this locally, but if you’d like, you can do so with this command.

npm run cypress:run

Unit and integration tests

Our primary test framework is the Jest test runner with React Testing Library (RTL). (These do not do the same thing and are not interchangeable; these two systems work closely together to provide a full unit and integration test environment.) See the list of resources below, which fully document why and how we use these.

Our goal is to be as close as possible to “industry best practice” in order to simplify our understanding and comprehension of tests. Please do not do anything exotic in these tests.

Front-end unit and integration tests are placed in a __tests__ folder in the same directory as the module being tested. Test modules should be named filename.test.js.
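
For example, using a test file mentioned later in this document, the convention looks like this (the module's exact location is inferred from the test path):

```
assets/scripts/app/
├── StreetEditable.js
└── __tests__/
    └── StreetEditable.test.js
```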

Integration tests are preferred over unit tests.

Whenever you find yourself writing boilerplate tests, consider writing an integration test rather than a unit test. A good sign that you should introduce an integration test is when you need to mock a lot of modules in order to test something in isolation. This is typically a sign of side effects, and an integration test will give us greater confidence.

This is especially the case with Redux-related actions. You shouldn’t need to write a test to see if the Redux store has changed. (We already have standard, boilerplate unit tests for the store itself.) Instead, write an integration test: rather than testing the action in isolation, test for the end result of the action.

Component testing

Many of our React components use Redux and react-intl, which are required in the component’s context to render properly. If a component has either (or both) in its context, use the helper functions in ./test/helpers/ which wrap React Testing Library’s render() with mock <Provider /> and <IntlProvider /> components.

Snapshot testing

Snapshots should be used with caution. They tend to break, and developers tend to update them without examining why a snapshot might have changed unexpectedly. Examine failing snapshots carefully before updating them, and be deliberate about when to add new ones.

Snapshots are good for testing distinct rendered output, such as error messages. Take a look at ./assets/scripts/app/__tests__/StreetEditable.test.js for an example of how snapshots can be used with error messages.

Mocks

Be aware of mocks. A few files and functions are mocked globally; load_resources, for example. If you need one of those modules in your tests or components, be sure to check its mock first. Otherwise, use jest.mock to mock modules, classes, and so on.

End-to-end tests

We use Cypress, a modern framework for end-to-end testing, to make writing and running our end-to-end tests easier. We currently use it sparingly, but we eventually want more tests to live in Cypress, where appropriate; end-to-end tests can then replace the unit or integration tests that they cover.

Cypress only runs in our automated continuous integration test environment by default, but can also be run locally:

npm run cypress:run

Linting

We use ESLint and Stylelint to lint JavaScript and CSS, respectively. There is a commit hook that automatically runs the linter on each commit. If the lint fails, you will need to fix your code and try your commit again, or force it to ignore the lint errors. For more information, see the Code styleguide.

We also use Prettier to automatically format code in a standardized way. It will only run on changed files.

Type safety

JavaScript is notoriously not type safe: you may pass any type of object or primitive to any function or method, whether or not it can handle them, and you may write a function that returns values of different types the calling code does not expect. Various attempts to introduce type safety on top of JavaScript have entered the ecosystem, and here’s how we use these tools.

PropTypes (React)

PropTypes is a runtime typechecking library used for React development. Because it is a runtime checker, PropTypes will only throw errors in the console when running in the browser or in test suites. (The PropTypes library is not compiled into production code.)

We currently enforce using PropTypes for React components in development. This means that React components must declare all of their props and what type of value each prop should be. The benefit of this approach is that React components self-document the props they accept. Sometimes a prop can be overloaded with multiple types, but this is generally discouraged if you can avoid it.

TypeScript

TypeScript is an extension of the JavaScript language that allows types to be checked statically (that is, reason about whether the right types are being passed around, without having to run the code itself). It’s been growing steadily in popularity over the past few years.

We have experimented with TypeScript, but we’ve not fully adopted it into Streetmix. Because we already compile code with Babel, adopting TypeScript piecemeal is doable. However, we have not yet run into a situation where we absolutely need TypeScript. That being said, if and when a good case can be made for adopting it, we will likely jump on board. If React components migrate to TypeScript, it will supersede using PropTypes.

Device and browser testing

We do not currently implement device or browser testing, but this is on our to-do list. We have a Browserstack account for this purpose.

Continuous integration (CI)

We use GitHub Actions to automatically run tests for every commit and pull request to our repository.

Skipping CI

CI can be skipped by appending [skip ci] to a commit message.

Automatic deployment

Every commit or merged pull request to the main branch that passes CI is automatically deployed to the staging server.

Currently, there is no automatic deployment to the production server. We’ve noticed that each deploy introduces a small amount of lag while the server software restarts. As a result, we now manually trigger deployments to the production server.

GitHub checks

In addition to continuous integration, we use some third-party services to keep an eye on code quality and test coverage. These services should be considered “code smell” detectors, and their findings taken with a grain of salt. They are not required to pass before merging pull requests.

CodeClimate

CodeClimate measures technical debt, or the long-term maintainability and readability of code. It applies some heuristics to detect and track “code smells,” which are opportunities to refactor code or fix potential bugs. A CodeClimate review is triggered automatically on every pull request, but some of the thresholds it uses are quite arbitrary. Here are some of the issues it raises, and how we address them, in order of increasing severity (as they apply to Streetmix):

  • Lines of code. CodeClimate triggers a warning when functions and modules exceed an arbitrary line limit. This means there is a potential opportunity to separate concerns, but we will never enforce this, since we don’t want to encourage “code golf” or quick workarounds instead of actually taking the time to separate logic. If something can be refactored into smaller pieces, but can’t be prioritized immediately, add a TODO comment instead. If something doesn’t make sense to shorten, mark the issue as “Wontfix”.
  • Duplicate code. CodeClimate triggers a warning when it detects code that looks the same as other code elsewhere. This can be an opportunity to refactor, but more often than not, CodeClimate is seeing similar-looking boilerplate code or patterns. In that case, mark the issue as “Invalid”.
  • Cognitive complexity. CodeClimate triggers a warning when a function contains too many conditional statements, resulting in complex branching or looping code. Not all code can be made simpler, but you may want to consider whether it can be written differently. However, use your best judgment here. If you don’t agree with CodeClimate’s assessment, mark the issue as “Wontfix”.
  • TODOs. CodeClimate tracks when a TODO or FIXME comment is written in the code. Because this is a developer’s own judgment call, it takes priority over other issues and should be addressed in the future. Never mark this as “Wontfix” or “Invalid”. If it’s no longer valid, remove the TODO or FIXME comment from the code instead.

Issues that should be addressed in the future, but can’t or won’t be addressed immediately, should be marked with “Confirmed.”

In spite of CodeClimate’s warnings, reviewers may approve its review even if the issues it raises are not addressed right away.

Codecov

Codecov measures code coverage: the percentage of code that is covered by at least one test suite. This percentage is a commonly used metric that software projects use to show how complete their test suites are. However, the percentage itself is not necessarily a measurement of test quality. As a result, while we strive for higher coverage, 100% is not the goal.

A Codecov review is triggered automatically on every pull request, which allows a reviewer to see at a glance whether a pull request increases or decreases overall code coverage. It fails if a large amount of new code is added without a corresponding increase in test coverage.

Because our test coverage is quite low at the moment, we prefer that all new and refactored code come with test coverage.

Resources

These additional resources from the developer community help guide our approach to testing. This is not an exhaustive list, and we’ll keep updating this over time.