Developer Blog

High-velocity Software Development Using Nested Git Branches

Implementing several different unrelated code changes at the same time in the same feature branch is like trying to have a conversation about several completely different topics at the same time with the same person. It is never productive. We end up mixing up issues or forgetting to think about important edge cases of one topic because we are distracted by the other topics. We only have part of our brain available for each topic. Dealing with several issues at the same time might create the illusion of multitasking and productivity, but in the end, it is faster, easier, safer, cleaner, and less error-prone to take on each topic separately.

This blog post describes a technique for highly focused development: implementing code changes as a series of nested Git branches. We use specialized tooling (Git Town) to make this way of working easy and efficient.

Example

As an example, let’s say we want to implement a new feature for an existing product. But to make such a complex change, we have to get the code base ready for it first:

  • clean up some technical drift by improving the names of classes and functions that don’t make sense anymore
  • add some flexibility to the architecture so that the new feature can be built with less hackery
  • while looking through the code base, we also found a few typos we want to fix

Let’s implement these things as a chain of independent but connected feature branches! The tables below show the Git Town commands – as well as the individual Git commands you would have to run without Git Town – to achieve that. First, let’s fix those typos, because that’s the easiest change and there is no reason to keep looking at them.

Fix typos

We create a feature branch named 1-fix-typos to contain the typo fixes from the master branch:

Git Town command       Git commands                  description
git hack 1-fix-typos   git checkout master           we want to build on top of the latest version of master
                       git pull
                       git checkout -b 1-fix-typos

We do a few commits fixing typos, and submit a pull request:

git new-pull-request

This opens a browser window to create the pull request on your code hosting service. All of this only took us 30 seconds. While the code review for those changes gets underway, we move on to fixing the technical drift.

Rename foo

We don’t want to look at the typos we just fixed again, so let’s perform any further changes on top of branch 1-fix-typos:

Git Town command          Git commands
git append 2-rename-foo   git checkout -b 2-rename-foo

git append creates a new feature branch by cutting it from the current branch, resulting in this branch hierarchy:

master
  \
   1-fix-typos
     \
      2-rename-foo

Now we commit the changes that rename the foo variable and start the next pull request:

git new-pull-request

Because we use git append to create the new branch, Git Town knows about the branch hierarchy and creates a pull request from branch 2-rename-foo against branch 1-fix-typos. This guarantees that the pull request only shows changes made in branch 2, but not the changes made in branch 1.

Rename bar

This is a different change than renaming foo, so let’s do it in a different branch. Some of these changes might happen in the same places where we just renamed foo. We don’t want to have to deal with merge conflicts later. Those are boring and risky at the same time. So let’s make these changes on top of the changes we made in step 2:

Git Town command          Git commands
git append 3-rename-bar   git checkout -b 3-rename-bar

We end up with this branch hierarchy:

master
  \
   1-fix-typos
     \
      2-rename-foo
        \
         3-rename-bar

Fix more typos

While renaming bar, we stumbled on a few more typos. Let’s add them to the first branch.

git checkout 1-fix-typos
# make the changes and commit them here
git checkout 3-rename-bar

Back on branch 3-rename-bar, the freshly fixed typos are visible again because the commit to fix them only exists in branch 1-fix-typos right now. Luckily, Git Town can propagate these changes through all other branches automatically!

Git Town command   Git commands
git sync           git checkout 2-rename-foo
                   git merge 1-fix-typos
                   git push
                   git checkout 3-rename-bar
                   git merge 2-rename-foo
                   git push

Generalize the infrastructure

Okay, where were we? Right! With things properly named it is now easier to make sense of larger changes. We cut branch 4-generalize-infrastructure and perform the refactor in it. It has to be a child branch of 3-rename-bar, since the improved variable naming done before will make the larger changes we are about to do now more intuitive.

Git Town command                          Git commands
git append 4-generalize-infrastructure    git checkout -b 4-generalize-infrastructure

Lots of coding and committing into this branch to generalize things. Since that’s all we do here, and nothing else, it’s pretty straightforward to get through it, though. Off goes the code review for those changes.

Shipping the typo fixes

In the meantime, we got the approval for the typo fixes in step 1. Let’s ship them!

Git Town command       Git commands                                description
git ship 1-fix-typos   git stash -u                                move things we are currently working on out of the way
                       git checkout master                         update master so that we ship our changes on top of the most current state of the code base
                       git pull
                       git checkout 1-fix-typos
                       git pull                                    make sure the local machine has all the changes made in the 1-fix-typos branch
                       git merge master                            resolve any merge conflicts between our feature and the latest master now, on the feature branch
                       git checkout master
                       git merge --squash 1-fix-typos              use a squash merge to get rid of all temporary commits and merges on the branch
                       git push
                       git branch -d 1-fix-typos                   delete the branch from the local machine
                       git push origin :1-fix-typos                delete the branch from the remote repository
                       git checkout 4-generalize-infrastructure    return to the branch we were working on
                       git stash pop                               restore what we were working on

With branch 1-fix-typos shipped, our branch hierarchy looks like this now:

master
  \
   2-rename-foo
     \
      3-rename-bar
        \
         4-generalize-infrastructure

Synchronizing our work with the rest of the world

We have been at it for a while. Other developers on the team have shipped things too, and the branch 2-rename-foo is still based on an outdated commit of master. We don’t want our branches to deviate too much from the master branch, since that can lead to more severe merge conflicts later. Let’s get everything in sync!

Git Town command   Git commands                                description
git sync           git stash -u                                move things we are currently working on out of the way
                   git checkout master
                   git pull
                   git checkout 2-rename-foo
                   git merge master
                   git push
                   git checkout 3-rename-bar
                   git merge 2-rename-foo
                   git push
                   git checkout 4-generalize-infrastructure
                   git merge 3-rename-bar
                   git push
                   git stash pop                               restore what we were working on

Because we used git append to create the new branches, Git Town knows which branch is a child of which other branch, and can do the merges in the right order.

Build the new feature

Back to business. With the new generalized infrastructure in place, we can now add the new feature in a clean way. To build the new feature on top of the new infrastructure:

Git Town command           Git commands
git append 5-add-feature   git checkout -b 5-add-feature

Let’s stop here. Hopefully, it is clear how to work in a more focused way using Git Town. Let’s review:

  • each change happens in its own feature branch
  • run git append to create a new feature branch
  • run git sync several times a day to keep all feature branches in sync with the rest of the world
  • run git ship to ship a branch once it is approved

Advantages of focused feature branches

Working this way has a number of important advantages:

  • Focused changes are easier and faster to create: if you change just one thing, you can do it quickly, make sure it makes sense, and move on to the next issue in another branch. No more getting stuck not knowing which of the many changes you did in the last 10 minutes broke the build, and no need to fire up the debugger to resolve this mess.
  • They are easier and faster to review: The PR can have a simple description to summarize it. Reviewers can easily wrap their heads around what the PR is doing and make sure the changes are correct and complete. This is also true if you write code just by yourself. You should give it a final review and cleanup before merging it into master!
  • Branches containing focused changes cause fewer merge conflicts than branches with many changes in them. They make fewer changes, and the changes are typically more spread out across lines. This gives Git more opportunity to resolve merge issues automatically.
  • In case you have to resolve merge conflicts manually, they are also easier and safer to resolve. You know exactly what single change both branches perform, versus what five changes each of the branches perform in the unfocused scenario.
  • Others can start reviewing parts of your changes sooner because you start submitting pull requests earlier.

Ultimately, using this technique you will get more work done faster. You have more fun because there is a lot less getting stuck, spinning wheels, and starting over. Working this way requires running a lot more Git commands, but with Git Town this is a complete non-issue since it automates this repetition for you.

Best Practices

To fully leverage this technique, all you have to do is follow a few simple rules:

postpone ideas: when you work on something and run across an idea for another change, resist the urge to do it right away. Instead, write down the change you want to do (I use a sheet of paper or a simple text file), finish the change you are working on right now, and then perform the new change in a new branch a few minutes later. If you can’t wait at all, commit your open changes, create the next branch, perform the new changes there, then return to the previous branch and finish the work there. If you somehow ended up doing several changes at once in a branch anyway, you can still refactor your Git branches so that they end up containing only one change.

go with one chain of branches: When in doubt whether changes depend on previous changes, and might or might not cause merge conflicts later, just work in child branches. It has almost no side effects, except that you have to ship the ancestor branches first. If your branches are focused, you will get very fast reviews, be able to ship them quickly, and they won’t accumulate.

do large refactorings first: In our example we did the refactor relatively late in the chain because it wasn’t that substantial. Large refactorings that touch a lot of files have the biggest potential for merge conflicts with changes from other people, though. You don’t want them hanging around for too long, but get them shipped as quickly as you can. You can use git prepend to insert feature branches before the currently checked out feature branch. If you already have a long chain of unreviewed feature branches, try to insert the large refactor into the beginning of your chain, so that it can be reviewed and shipped as quickly as possible:

git checkout 2-rename-foo
git prepend 1-other-large-refactor

This leads to the following branch hierarchy:

master
  \
   1-other-large-refactor
     \
      2-rename-foo
        \
         3-rename-bar
           \
            4-generalize-infrastructure

The new large refactor is at the front of the line, can be shipped right when it is reviewed, and our other changes are now built on top of it.

Happy hacking!

Continuous Delivery to DC/OS From CircleCI

At Originate we use CircleCI for all of our Continuous Integration & Delivery needs. We’ve recently started to use DC/OS to deploy some of our software. This blog post will walk through the steps required to deploy a simple Marathon web service to DC/OS from CircleCI.

We’ll start with a very basic circle.yml and build it up as we go.

circle.yml
test:
  override:
    - echo "You'd normally run your tests here"

The best way to interact programmatically with DC/OS is through the CLI tool provided by Mesosphere. Let’s add it to our build container.

circle.yml
machine:
  environment:
    PATH: $HOME/.cache/bin:$PATH   # <- Add our cached binaries to the $PATH

dependencies:
  override:
    - scripts/ci/install-dcos-cli  # <- Install the DC/OS CLI

  cache_directories:
    - "~/.cache"                   # <- Cache binaries between builds to speed things up

test:
  override:
    - echo "You'd normally run your tests here"
scripts/ci/install-dcos-cli
#!/usr/bin/env bash

set -euo pipefail

if [[ ! -z ${VERBOSE+x} ]]; then
  set -x
fi

BINS="$HOME/.cache/bin"

DCOS_CLI="$BINS/dcos"
DCOS_CLI_VERSION="0.4.16"

# Returns the version of the currently installed DC/OS CLI binary. e.g 0.4.16
installed_version() {
  dcos --version | grep 'dcoscli.version' | cut -d= -f2
}

# Downloads the DC/OS CLI binary for linux with version $DCOS_CLI_VERSION to the cache
install_dcos_cli() {
  mkdir -p "$BINS"
  curl -sSL "https://downloads.dcos.io/binaries/cli/linux/x86-64/$DCOS_CLI_VERSION/dcos" \
    -o "$DCOS_CLI"
  chmod u+x "$DCOS_CLI"
}

# Install the DC/OS CLI if it's missing. If it's present, upgrade it if needed otherwise do nothing
if [ ! -e "$DCOS_CLI" ]; then
  echo "DC/OS CLI not found. Installing"
  install_dcos_cli
else
  INSTALLED_VERSION="$(installed_version)"
  if [ "$DCOS_CLI_VERSION" != "$INSTALLED_VERSION" ]; then
    echo "DC/OS CLI has version $INSTALLED_VERSION, want $DCOS_CLI_VERSION. Upgrading"
    rm -rf "$DCOS_CLI"
    install_dcos_cli
  else
    echo "Using cached DC/OS CLI $INSTALLED_VERSION"
  fi
fi

Now let’s set up a basic Marathon service. We’ll use a stock nginx image as a stand-in for a web service. Our service will have one instance with 1 CPU and 512MB of RAM and will map a random port on the host to port 80 in the container.

services/hello-world/marathon.json
{
  "id": "hello-world",
  "cpus": 1,
  "mem": 512,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "nginx:1.11-alpine",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 80,
          "hostPort": 0,
          "protocol": "tcp"
        }
      ]
    }
  }
}

The next step is to add a small script to send the manifest to DC/OS using the CLI tool. The Marathon API has separate endpoints for creating and updating a service, so we have to check whether the service already exists and then call the right one. We’re using --force when updating to override any previous, potentially faulty deployment.

scripts/ci/marathon-deploy
#!/usr/bin/env bash

set -euo pipefail

if [[ ! -z ${VERBOSE+x} ]]; then
  set -x
fi

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

SERVICE="$1"
SERVICE_MANIFEST="$SERVICE/marathon.json"
# Marathon automatically prefixes service names with a /. We use the slash-less version in the manifest.
SERVICE_ID="/$(jq -r '.id' "$SERVICE_MANIFEST")"

# This returns true if the service currently exists in Marathon (e.g. needs to be updated instead of created)
service_exists() {
  local service_id="$1"

  dcos marathon app list --json | jq  -r '.[].id' | grep -Eq "^$service_id$"
}

if service_exists "$SERVICE_ID"; then
  dcos marathon app update "$SERVICE_ID" < "$SERVICE_MANIFEST" --force
else
  dcos marathon app add < "$SERVICE_MANIFEST"
fi

We can now add the deployment section to the circle.yml file.

circle.yml
machine:
  environment:
    PATH: $HOME/.cache/bin:$PATH

dependencies:
  override:
    - scripts/ci/install-dcos-cli

  cache_directories:
    - "~/.cache"

test:
  override:
    - echo "You'd normally run your tests here"

deployment: # <- Add the deployment section
  master:
    branch: master
    commands:
      - scripts/ci/marathon-deploy services/hello-world

We’re almost there; the last thing we have to resolve is how to authenticate the DC/OS CLI to the cluster. Without that, we can’t call the APIs we need to deploy our service.

DC/OS Community Edition uses an OAuth authentication flow backed by Auth0. This is used for both the browser-based authentication to get access to the admin dashboard as well as for the CLI tool. In the latter case, the user has to follow a slightly different browser-based flow yielding a token that is then provided to the CLI.

DC/OS CE Authentication Flow

In a CI/CD setting, anything that requires manual user intervention is a non-starter. Enter dcos-login, the tool that we’ve created to solve this problem. Given a set of GitHub credentials, it will replace the human component in the login flow and let you run the DC/OS CLI in your CI environment.

We recommend creating a separate “service” GitHub account just for that purpose. Once that’s done, you can set the GH_USERNAME and GH_PASSWORD environment variables in CircleCI to the username and password for that account.

Just like the DC/OS CLI, we need to pull down dcos-login to our build container.

scripts/ci/install-dcos-login
#!/usr/bin/env bash

set -euo pipefail

if [[ ! -z ${VERBOSE+x} ]]; then
  set -x
fi

BINS="$HOME/.cache/bin"

DCOS_LOGIN="$BINS/dcos-login"

# Returns the version of the currently installed dcos-login binary. e.g v0.24
installed_version() {
  dcos-login --version 2>&1
}

# Returns the version of the latest release of the dcos-login binary. e.g v0.24
latest_version() {
  curl -sSL https://api.github.com/repos/Originate/dcos-login/releases/latest | jq -r '.name'
}

# Downloads the latest version of the dcos-login binary for linux to the cache
install_dcos_login() {
  mkdir -p "$BINS"

  LATEST_RELEASE="$(curl -sSL https://api.github.com/repos/Originate/dcos-login/releases/latest)"
  DOWNLOAD_URL=$(jq -r '.assets[] | select(.name == "dcos-login_linux_amd64") | .url' <<< "$LATEST_RELEASE")

  curl -sSL -H 'Accept: application/octet-stream' "$DOWNLOAD_URL" -o "$DCOS_LOGIN"
  chmod u+x "$DCOS_LOGIN"
}

# Install dcos-login if it's missing. If it's present, upgrade it if needed otherwise do nothing
if [ ! -e "$DCOS_LOGIN" ]; then
  echo "dcos-login not found. Installing"
  install_dcos_login
else
  INSTALLED_VERSION="$(installed_version)"
  LATEST_VERSION="$(latest_version)"
  if [ "$LATEST_VERSION" != "$INSTALLED_VERSION" ]; then
    echo "dcos-login has version $INSTALLED_VERSION, latest is $LATEST_VERSION. Upgrading"
    rm -rf "$DCOS_LOGIN"
    install_dcos_login
  else
    echo "Using cached dcos-login $INSTALLED_VERSION"
  fi
fi

Next we need to use dcos-login to authenticate the DC/OS CLI. You’ll also need to provide the URL to your DC/OS cluster in the CLUSTER_URL environment variable.

scripts/ci/login
#!/usr/bin/env bash

set -euo pipefail

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"

# Check that all required variables are set; skip the login if credentials are missing
for NAME in GH_USERNAME GH_PASSWORD; do
  if [[ -z "${!NAME:-}" ]]; then
    echo "$NAME is not set, moving on"
    exit 0
  fi
done

if [[ ! -z ${VERBOSE+x} ]]; then
  set -x
fi

CLUSTER_URL="$1"

# Setup the DC/OS CLI
## Point it to the DC/OS cluster. CLUSTER_URL is the URL of the admin dashboard
dcos config set core.dcos_url "$CLUSTER_URL"

## Set the ACS token. dcos-login reads GH_USERNAME and GH_PASSWORD from the environment automatically
dcos config set core.dcos_acs_token "$(dcos-login --cluster-url "$CLUSTER_URL")"

Finally, tie it all together in the circle.yml file.

circle.yml
machine:
  environment:
    PATH: $HOME/.cache/bin:$PATH

dependencies:
  override:
    - scripts/ci/install-dcos-cli
    - scripts/ci/install-dcos-login     # <- Install the dcos-login tool

  cache_directories:
    - "~/.cache"

test:
  override:
    - echo "You'd normally run your tests here"

deployment:
  master:
    branch: master
    commands:
      - scripts/ci/login "$CLUSTER_URL" # <- Authenticate the DC/OS CLI
      - scripts/ci/marathon-deploy services/hello-world

That’s it: you can now deploy your services from CircleCI to DC/OS Community Edition. The dcos-login tool is free and open source. All the code in this blog post can be found in this example project.

FAQ

What about private docker registries?

Marathon lets you pull docker images from private registries with a bit of configuration. You need to tar up the .docker directory of an authenticated host and instruct Marathon to pull down that archive when launching a new instance of a service:

"fetch": [
  {
    "uri": "https://some-s3-bucket.s3.amazonaws.com/docker-hub.tar.gz"
  }
]

How do I pass configuration / secrets to my service?

A couple of strategies here:

  • Set the environment variable in CircleCI and then use sed or envsubst to replace placeholders in your marathon.json.
  • Store a subset of your marathon.json with your configuration / secrets on S3 or similar. Pull down that file (which might be specific per environment) at deploy time and use jq to merge it into the main marathon.json manifest for your service. You can use something like jq -s '.[0] * .[1]' marathon.json secrets.json.

How do I assign a DNS name to my service?

If you’re using Marathon LB you can add the following section to your marathon.json:

"labels": {
  "HAPROXY_GROUP": "external",
  "HAPROXY_0_VHOST": "hello-world.yourdomain.com",
  "HAPROXY_0_BACKEND_HTTP_OPTIONS": "  acl is_proxy_https hdr(X-Forwarded-Proto) https\n  redirect scheme https unless { ssl_fc } or is_proxy_https\n"
}

The HAPROXY_0_VHOST value instructs Marathon LB to map the first port in your port mapping (index 0, HTTP in our case) to that virtual host. You should have an entry in your DNS zone pointing *.yourdomain.com to the public DC/OS nodes running Marathon LB.

How do I make sure that my service is alive?

You can instruct Marathon to perform a health check on your behalf by adding the following to your marathon.json:

"healthChecks": [
  {
    "path": "/",
    "protocol": "HTTP",
    "portIndex": 0,
    "gracePeriodSeconds": 10,
    "intervalSeconds": 20,
    "maxConsecutiveFailures": 3
  }
]

As with the virtual host above, portIndex corresponds to the index of the port in the portMappings section.

Thoughts About Syntax

This is an opinion piece by one of our employees, and not an official statement of Originate. At Originate we celebrate diversity of opinions around technology as one of our greatest strengths, and we encourage our employees to share their ideas on our blog.

These are amazing times to be a software developer. We have access to a vast and multi-faceted ecosystem of very well thought out programming languages crafted by masters. More are added regularly, and they are typically free to use and adapt.

As somebody who looks at a lot of source code every day, needing to evaluate its essence, architecture, and structure quickly, I have learned that less is often more when it comes to syntax. In my opinion, this is especially true for infrastructure elements like parentheses, curly braces, or semicolons. This blog post is an attempt at comparing their costs and benefits somewhat objectively.

Before we dive into this, let’s remind ourselves that the #1 reason for using a programming language is enjoying using it and getting stuff done. Shipping things. Most languages achieve that, truly awesome software has been written in many different stacks, and that’s a very good thing. With that said, if you are into a bit of recreational bikeshedding about programming language syntax, and you don’t just want to stick to what you and others are used to, read on!

Up front, some simple rules that I think we all can agree on, which we will use to judge the different approaches:

  • simpler is better: if there are two alternatives that provide more or less the same functionality, and one is simpler, easier, or smaller, then that is the preferred alternative

  • robustness: the alternative that provides less rope for people to hang themselves with, i.e. fewer possibilities to mess up, wins

  • the signal-to-noise ratio of code should be high. This means most of the source code should describe business functionality. Boilerplate (code that is just there to make the machine work) should be minimal. Said another way, code should express developer intent as directly as possible, without too much infrastructure around it.

With that in mind, let’s look at a few common syntax elements and see if we can bring some clarity into the debates around them. Maybe we can even come up with an idea of what an ideal programming language syntax looks like, if there is such a thing. Or maybe it’s all a wash. To keep things easy, the examples are focused around small and simple languages like JavaScript/CoffeeScript, Go, Python, Java, etc. Let’s dig in!

Curly braces

Curly braces replaced the even older begin and end statements of Algol, Pascal, and friends, which was a big step forward since it saves a good amount of typing without losing expressiveness. The question is, are curly braces the final step, or can we optimize things even more here? Let’s look at some code. What does this snippet do:

function max(a,b){log(a,b);if(a>=b){return a}else{return b};

Did you spot the typo? Hard to read, right? That’s because this code relies solely on delimiters like curly braces, parentheses, and semicolons to describe code structure. Let’s format this in a more human-friendly way by writing each statement on its own line and adding indentation:

function max(a, b) {
  log(a, b);
  if (a >= b) {
    return a
  } else {
    return b
};

So much more readable! And it’s much easier to see that a closing curly brace is missing at the end. This experiment demonstrates that people use indentation as the primary mechanism to infer code structure, and use curly braces only as the backup, in edge cases like when the indentation is messed up and not obvious.

Braces also introduce extra categories of errors. Forgetting to close a curly brace will cause a program to behave in unexpected ways, even though the actual statements in it (the code that computes arguments into results) are sound. Indenting code correctly but misplacing curly braces results in code that does something else than what most programmers would expect when only briefly looking at it.

Code with curly braces still has to be indented properly. When we do that, this code uses two different ways of expressing code structure: indentation and curly braces. Humans mostly look at the indentation, and “filter out” the curly braces. Parsers look only at the braces and ignore indentation. Because both of these stakeholders should always agree 100% on what the code is supposed to do, these two ways of expressing code structure must always be in perfect sync with each other. As an example, this code here compiles and runs fine, but is misleading and hard to maintain since it is improperly formatted, i.e. the indentation is wrong:

function max(a, b) {
  if (a >= b) {
    if (a > 0) {
      return a
  } else {
    return b
    }
  } else {
      return a
  }
}

So if indentation is the most important aspect for humans, humans filter out curly braces most of the time, and any time indentation differs from code structure we end up with misleading code, what would happen if we got rid of curly braces altogether and only used indentation?

function max(a, b)
  log(a, b)
  if (a >= b)
    return a
  else
    return b

This is still clear and unambiguously structured code, understandable by both humans and parsers. It is also horizontally and vertically more compact, i.e. it uses fewer lines. It is clearer, and it avoids a number of issues like forgotten opening or closing curly braces, or whether the opening curly brace should be on a separate line or not. Because indentation errors are parse errors in whitespace-sensitive languages, we can rest assured the indentation is always correct, and we don’t need formatters or linters to correct it for us, which people might or might not run at all times.

At best, when used correctly, curly braces are just there, don’t add much readability, and get filtered out by humans, since readability is mostly provided by indentation. At worst, when curly braces don’t match indentation, human developers can be seriously misled. While proper indentation is a necessity, as we have seen above, curly braces are redundant and unnecessary. That’s why we call them line noise. They are just more stuff to type, more stuff to verify, and they exist mostly to satisfy our habits at this point. They are a legacy, and according to our rules above we are better off simply leaving them out moving forward.

Semicolon to terminate lines

What is the difference between these two code blocks?

function max(a, b)
  log(a, b);
  if (a >= b)
    return a;
  else
    return b;

and

function max(a, b)
  log(a, b)
  if (a >= b)
    return a
  else
    return b

Nothing. Nobody needs semicolons at the end of lines. Nobody misses them if they aren’t there. Their only use is to separate multiple expressions on the same line. So they should be optional.
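A quick illustration (a plain JavaScript-style sketch, variable names made up):

// one statement per line: no separators needed
total = price * quantity
tax = total * 0.08

// several statements on one line: now a separator earns its keep
total = price * quantity; tax = total * 0.08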

Parentheses around expressions

Next, what about parentheses around expressions? Let’s remove them from our example code block and see what happens:

function max a, b
  log a
  if a >= b
    return a
  else
    return b

It’s still clear that a and b are parameters to the function max, that we log a on the next line, and then check if a is larger or equal to b.

Similar to curly braces, parentheses are a redundant way of expressing code structure and developer intent. The real world isn’t always as simple as this example, so more complex code can definitely benefit from parentheses to group sub-expressions. But there seems to be no need to enforce their presence in simple cases. Let’s make them optional, so that we can simplify our code where possible, without giving up the ability to structure more complex expressions unambiguously.
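For instance (a JavaScript-style sketch with made-up names), grouping parentheses still carry meaning even when the decorative ones around simple calls and conditions become optional:

// grouping parentheses change what the expression means...
eligible = (isMember || hasCoupon) && total > 0

// ...because without them, && binds tighter than ||
eligible = isMember || hasCoupon && total > 0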

Comments

The most widely used ways to add comments to source code are via //, /* */, and #. Let’s look at C-style comments first:

// a single-line comment

/* a
 * multi-line
 * comment
 */

/**
 * a JavaDoc or JsDoc comment
 */

Now let’s look at comments via a single character:

# a single-line comment

# a
# multi-line
# comment

##
# a JavaDoc or JsDoc comment

Both code snippets do the same and are equally readable. The first version uses 19 comment characters and requires indenting subsequent lines of multi-line comments with a space (which I count as a character here, since it needs to be typed and verified as well). The second version only uses 7 comment characters, without any need for indentation, and results in fewer lines of code.

According to our rules the second version wins.

Spaces vs Tabs

The arguments for using tabs to indent code are:

  • because it saves you characters (1 tab vs 2 or 4 spaces), making the source code file smaller
  • because it allows different people to indent their code by different amounts, by configuring the tab size of their text editors
  • it avoids bikeshed debates about how deep code should be indented

The first argument comes from limitations of computer systems 60 years ago and is no longer valid. The arguments against tabs are:

  • the default tab size (8 spaces) is clearly too much. This means EVERY PERSON in the world who looks at code now has to configure EVERY TOOL and EVERY WEBSITE they use to write or view code to their preferred tab size. On EVERY MACHINE they use.
  • there are tools that don’t allow you to configure the tab size, for example many websites with embedded code snippets or many low-level terminal commands. These tools are often used to display source code.
  • the tab character is hard to insert into input fields, for example in search-and-replace forms, since it is also used to switch to the next input field. People often have to copy-and-paste a tab character from their source code in order to search for it.
  • not all keyboards have a TAB key. For example, most keyboards on mobile devices lack it. Mobile devices play a larger and larger role in our lives, including in software development. I review a good amount of code on my mobile device, and code reviewers sometimes need to provide code snippets as examples.
  • the tab character was not invented to indent source code. It exists to make it easier to format numerical content into columns and tables using tab stops. There are no columns, tables, or tab stops in source code.
  • tab-based indentation does not support many of the ways code can be formatted readably.

Let’s look at a few examples around the last point. One is formatting function calls with larger lists of complex parameters:

draw_circle( canvas_width / 1024 * 10,
             canvas_height / 786 * 15,
             canvas_width / 1024 * 5,
             display.colors.primary,
             round(canvas_width / 1024 * 3.5) )

This code draws a circle and calculates the parameters for it inline. Because the arguments are so complex, we want to put each one on a separate line to separate them visually. Putting them behind the function name makes it visually clear that they are parameters for this function. The equivalent code using tabs needs to move all arguments to the next line:

draw_circle(
        canvas_width / 1024 * 10,
        canvas_height / 786 * 15,
        canvas_width / 1024 * 5,
        display.colors.primary,
        round(canvas_width / 1024 * 3.5))

The problem with this way of formatting is that the arguments to draw_circle now look too much like an indented code block. This is confusing, especially if there is also an indented code block right below it.

Another – quite common – use case where tab-based formatting falls short is method chaining. With spaces, the code can be formatted nicely and clearly:

fancy_box = new Square().fill('red')
                        .move(-10, 20)
                        .rotate(35)
                        .zoom(2.8)

This makes it clear that we do a bunch of things with a Square instance. With tabs, we again have to move the call chain to the next line, making it look too much like a nested code block:

fancy_box = new Square()
        .fill('red')
        .move(-10, 20)
        .rotate(35)
        .zoom(2.8)

Based on these examples, we can conclude that using tabs to indent source code may sound good in theory, but only works for simple code in environments where only a few tools are used.

Let’s also evaluate the pros and cons of using spaces. Pros:

  • code looks exactly the same in all tools and environments
  • more flexibility around expressing complex code structures elegantly
  • works with all devices and keyboards

Cons:

  • opens up bikeshed debates about how many spaces to use to indent code
  • opens up bikeshed debates around what “elegant” code formatting looks like
  • formatting code with tabs feels a bit more semantic in nature, while using spaces feels more presentational

To finish this, let’s talk about how many spaces should be used to indent code. The most common candidates are 2 and 4, since we can (hopefully) agree that 8 is too large and 1 is not enough. Given that many communities like to limit the width of code (to something like 80 or 100 characters), 2 seems preferable since it leaves more horizontal space for code. So the question is, are 2 spaces enough indentation? Let’s look at our initial example to find out:

function max a, b
  log a, b
  if a >= b
    return a
  else
    return b

vs

function max a, b
    log a, b
    if a >= b
        return a
    else
        return b

Both styles work. Two-space indentation uses less horizontal space, so it wins.

Final thoughts

Following our rules has led us on a journey into minimal syntax, i.e. syntax that minimizes line noise and tries to get out of the way to let the business logic and developer intent shine through better. This is something that most of us deal with every day, but often don’t spend much time thinking about. Many modern languages (i.e. those designed in the last two decades) are adopting many of these elements in some form, with certainly more to come in the future.

Hopefully this was an enjoyable read, and some food for thought when you decide which language to learn next, or design your own language or data format. No matter what language you end up using, happy coding! :)

Cheat Codes for Contravariance and Covariance

I used to have nightmares about understanding variance. I thought things would get better when someone showed me this explanation…

panda hiding in an oreo factory

(image from rice.edu)

…but afterwards the nightmares got worse, since now they included clowns and burning pandas. I stayed up for days.

The goal of this post is to help you become more familiar with variance so that it is easier to understand in code you are reading, and so that you can use it when appropriate. The goal is not to promote the use (or overuse) of variance. Variance is a tool that can be powerful, but keep in mind that most parameterized types you define will be invariant.

Note that if you already know what variance is, and you just want a quick reference to remind you how to tell the difference between co- and contravariance, refer to the cheatsheet at the bottom of this post.

Why do we even need variance?

Variance is how we determine if instances of parameterized types are subtypes or supertypes of one another. In a statically typed environment with subtyping and generics, this boils down to the compiler needing to determine when one type can be substituted for another type in an expression. For simple types, this is straightforward:

trait Life

class Bacterium extends Life

trait Animal extends Life {
  def sound: String
}

class Dog(name: String, likesFrisbees: Boolean) extends Animal {
  val sound = "bark"
}

class Cat(name: String, likesHumans: Boolean) extends Animal {
  val sound = "meow"
}

def whatSoundDoesItMake(animal: Animal): String =
  animal.sound

A Dog is an Animal, and so is a Cat, so anything that expects an Animal can also take a Dog or Cat. This is a classic example of the Liskov substitution principle.

What are we really doing here though? We’re trying to determine when some type T can safely be substituted for a type U in an expression that expects a U. The expression that expects a U only cares that it gets a U for 2 reasons:

  • The expression calls methods on a U
  • The expression passes a U to another expression that expects a U

An expression that passes a U along eventually makes its way to an expression that does call methods on a U (or stores it for later method calling). So we’re left with only caring about whether something can be substituted for another thing, because we want to make sure that it is safe to call methods on that thing (and by “safe” I mean “will never fail at runtime”). This is what we want our compiler to do: make sure that we never call the method something.sound on a type that does not have the method sound defined.

A wild variant appears

Looking at a type that has parameters, it is no longer obvious when substitution within an expression is allowed. In other words, if a function takes an argument of type ParametricType[T], is it safe to pass it a ParametricType[U]? This is what variance is all about.

Covariance

Container types are the best example of something that is covariant, so let’s look at an example:

val dogs: Seq[Dog] = Seq(
  new Dog("zoe", likesFrisbees = true),
  new Dog("james vermillion borivarge III", likesFrisbees = false)
)

val cats: Seq[Cat] = Seq(
  new Cat("cheesecake", likesHumans = true),
  new Cat("charlene", likesHumans = false)
)

def whatSoundsDoTheyMake(animals: Seq[Animal]): Seq[String] =
  animals map (_.sound)

Our method whatSoundsDoTheyMake expects a Seq[Animal], and it calls the method .sound on those animals. We know that all Animals have the method .sound defined on them, and we know that we are mapping over a list of Animals, so it’s totally OK to pass whatSoundsDoTheyMake a Seq[Dog] or a Seq[Cat].

Dog <: Animal implies Seq[Dog] <: Seq[Animal]

Notice where the method call on the animals actually happens. It doesn’t happen within the definition of Seq. Rather, it happens inside of a function that receives the Animal as an argument. Now consider what would happen if we tried to pass a Seq[Life] to whatSoundsDoTheyMake. First off, the compiler wouldn’t allow this because it’s unsafe: error: value sound is not a member of Life. If it were allowed though, then you could attempt to call bacterium.sound, even though the method doesn’t exist on that object. Note that in a dynamically typed language you could try to do this, but you’d get a runtime exception like TypeError: Object #<Bacterium> has no method 'sound'.
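Here is roughly what that failure mode looks like in a dynamically typed language (a hypothetical JavaScript sketch; the exact error message depends on the runtime):

// Nothing stops us from passing the wrong thing before the code runs
function whatSoundsDoTheyMake(animals) {
  return animals.map(animal => animal.sound())   // assumes every element has a sound() method
}

const dog = { sound: () => "bark" }
const bacterium = {}                             // no sound() defined

whatSoundsDoTheyMake([dog])             // ["bark"]
whatSoundsDoTheyMake([dog, bacterium])  // throws at runtime: TypeError: animal.sound is not a function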

Interestingly, the real problem doesn’t occur within Seq; it occurs later on down the chain. The reason is that a generic type makes guarantees to other types and functions that it interacts with. Declaring a class as covariant on type T is equivalent to saying “if you call functions on me, and I provide you with an instance of my generic type T, you can be damn sure that every method you expect will be there”. When that guarantee goes away, all hell breaks loose.

Contravariance

Functions are the best example of contravariance (note that they’re only contravariant on their arguments, and they’re actually covariant on their result). For example:

class Dachshund(
  name: String,
  likesFrisbees: Boolean,
  val weinerness: Double
) extends Dog(name, likesFrisbees)

def soundCuteness(animal: Animal): Double =
  -4.0/animal.sound.length

def weinerosity(dachshund: Dachshund): Double =
  dachshund.weinerness * 100.0

def isDogCuteEnough(dog: Dog, f: Dog => Double): Boolean =
  f(dog) >= 0.5

Should we be able to pass weinerosity as an argument to isDogCuteEnough? The answer is no, because the function isDogCuteEnough only guarantees that it can pass, at its most specific, a Dog to the function f. When the function f expects something more specific than what isDogCuteEnough can provide, it could attempt to call a method that some Dogs don’t have (like .weinerness on a Greyhound, which is insane).

What about soundCuteness, can we pass that to isDogCuteEnough? In this case, the answer is yes, because even if isDogCuteEnough passes a Dog to soundCuteness, soundCuteness takes an Animal, so it can only call methods that all Dogs are guaranteed to have.

Dog <: Animal implies Function1[Animal, Double] <: Function1[Dog, Double]

A function that takes something less specific as an argument can be substituted in an expression that expects a function that takes a more specific argument.

Conclusion

Enforcing safety by following expression substitution rules for parameterized types is a complex but super useful tool. It constrains what we can do, but these are things that we shouldn’t do, because they can fail at runtime. Variance rules, and type safety in general, can be seen as a set of restrictions that force us to engineer solutions that are more robust and logically sound. It’s like how bones and muscles are a set of constraints that allow for extremely complex and functional motion. You’ll never find a boneless creature that can move like this:

extremely complex and functional motion

Cheatsheet

Here is how to determine if your type ParametricType[T] can/cannot be covariant/contravariant:

A type can be covariant when it does not call methods on the type that it is generic over. If the type needs to call methods on generic objects that are passed into it, it cannot be covariant.

Archetypal examples: Seq[+A], Option[+A], Future[+T]

A type can be contravariant when it does call methods on the type that it is generic over. If the type needs to return values of the type it is generic over, it cannot be contravariant.

Archetypal examples: Function1[-T1, +R], CanBuildFrom[-From, -Elem, +To], OutputChannel[-Msg]

Rest assured, the compiler will inform you when you break these rules:

trait T[+A] { def consumeA(a: A) = ??? }
// error: covariant type A occurs in contravariant position
//   in type A of value a
//       trait T[+A] { def consumeA(a: A) = ??? }
//                                  ^

trait T[-A] { def provideA: A = ??? }
// error: contravariant type A occurs in covariant position in
//  type => A of method provideA
//       trait T[-A] { def provideA: A = ??? }
//                         ^

Write Modern Asynchronous Javascript Using Promises, Generators, and Coroutines

Over the years, “Callback Hell” has been cited as one of the most common anti-patterns in Javascript to manage concurrency. Just in case you’ve forgotten what that looks like, here is an example of verifying and processing a transaction in Express:

app.post("/purchase", (req, res) => {
  user.findOne(req.body, (err, userData) => {
    if (err) return handleError(err);
    permissions.findAll(userData, (err2, permissions) => {
      if (err2) return handleError(err2);
      if (isAllowed(permissions)) {
        transaction.process(userData, (err3, confirmNum) => {
          if (err3) return handleError(err3);
          res.send("Your purchase was successful!");
        });
      }
    });
  });
});

Promises were supposed to save us…

I was told that promises would allow us Javascript developers to write asynchronous code as if it were synchronous by wrapping our async functions in a special object. In order to access the value of the Promise, we call either .then or .catch on the Promise object. But what happens when we try to refactor the above example using Promises?

// all asynchronous methods have been promisified
app.post("/purchase", (req, res) => {
  user.findOneAsync(req.body)
    .then( userData => permissions.findAllAsync(userData) )
    .then( permissions => {
      if (isAllowed(permissions)) {
        return transaction.processAsync(userData);
        // userData is not defined! It's not in the proper scope!
      }
    })
    .then( confirmNum => res.send("Your purchase was successful!") )
    .catch( err => handleError(err) )
});

Since each callback in the chain has its own scope, we cannot access the userData object inside of the second .then callback. So after a little digging, I couldn’t find an elegant solution, but I did find a frustrating one:

Just indent your promises so that they have proper scoping.

Indent my promises!? So it’s back to the Pyramid of Doom now?

app.post("/purchase", (req, res) => {
  user.findOneAsync(req.body)
    .then( userData => {
      return permissions
        .findAllAsync(userData)
        .then( permissions => {
          if (isAllowed(permissions)) {
            return transaction.processAsync(userData);
          }
        });
  }).then( confirmNum => res.send("Your purchase was successful!"))
    .catch( err => handleError(err) )
});

I would argue that the nested callback version looks cleaner and is easier to reason about than the nested promise version.

Async Await Will Save Us!

The async and await keywords will finally allow us to write our javascript code as though it is synchronous. Here is code written using those keywords coming in ES7:

app.post("/purchase", async function (req, res) {
  const userData = await user.findOneAsync(req.body);
  const permissions = await permissions.findAllAsync(userData);
  if (isAllowed(permissions)) {
    const confirmNum = await transaction.processAsync(userData);
    res.send("Your purchase was successful!")
  }
});

Unfortunately, the majority of ES7 features, including async/await, have not been implemented in JavaScript runtimes and therefore require the use of a transpiler. However, you can write code that looks exactly like the code above using ES6 features that have been implemented in most modern browsers as well as Node version 4+.

The Dynamic Duo: Generators and Coroutines

Generators are a great metaprogramming tool. They can be used for things like lazy evaluation, iterating over memory-intensive data sets, and on-demand data processing from multiple data sources using a library like RxJs. However, we wouldn’t want to use generators alone in production code because they force us to reason about a process over time: each time we call next, we jump back into our generator like a GOTO statement. Coroutines understand this and remedy the situation by wrapping a generator and abstracting away all of that complexity.
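To demystify that a little, here is a minimal sketch of what such a coroutine runner does under the hood. The runCoroutine name is made up for illustration, and unlike real libraries such as Co it assumes every yielded value is a promise:

function runCoroutine(generatorFn) {
  return new Promise((resolve, reject) => {
    const generator = generatorFn();

    function step(advance) {
      let result;
      try {
        result = advance();                // generator.next(value) or generator.throw(error)
      } catch (err) {
        return reject(err);                // an uncaught error inside the generator rejects the coroutine
      }
      if (result.done) return resolve(result.value);
      // wait for the yielded promise, then feed its outcome back into the generator
      Promise.resolve(result.value).then(
        value => step(() => generator.next(value)),
        error => step(() => generator.throw(error))
      );
    }

    step(() => generator.next());
  });
}

Each yield suspends the generator, the runner waits for the promise, and then resumes the generator with the resolved value. That is what lets the code inside the generator read top to bottom as if it were synchronous.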

The ES6 version using Coroutines

Coroutines allow us to use yield to execute our asynchronous functions line by line, making our code look synchronous. It’s important to note that I am using the Co library. Co’s coroutine will execute the generator immediately, whereas Bluebird’s coroutine will return a function that you must invoke to run the generator.

import co from 'co';
app.post("/purchase", (req, res) => {
  co(function* () {
    const person = yield user.findOneAsync(req.body);
    const permissions = yield permissions.findAllAsync(person);
    if (isAllowed(permissions)) {
      const confirmNum = yield transaction.processAsync(person);
      res.send("Your transaction was successful!")
    }
  }).catch(err => handleError(err))
});

If there is an error at any step in the generator, the coroutine will stop execution and return a rejected promise. Let’s establish some basic rules to using coroutines:

  • Any function to the right of a yield must return a Promise.
  • If you want to execute your code now, use co.
  • If you want to execute your code later, use co.wrap (see the example after this list).
  • Make sure to chain a .catch at the end of your coroutine to handle errors. Otherwise, you should wrap your code in a try/catch block.
  • Bluebird’s Promise.coroutine is the equivalent of Co’s co.wrap, not co on its own.
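To make the co versus co.wrap distinction concrete, here is a small sketch that reuses the promisified user model and handleError helper from the earlier examples:

import co from 'co';

// co(...) runs the generator immediately and returns a promise of its result
co(function* () {
  return yield user.findOneAsync({name: "Will"});
}).then(will => console.log(will.name))
  .catch(err => handleError(err));

// co.wrap(...) returns a reusable function; the generator only runs when that function is invoked
const findByName = co.wrap(function* (name) {
  return yield user.findOneAsync({name});
});

findByName("Adam")
  .then(adam => console.log(adam.name))
  .catch(err => handleError(err));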

What if I want to run multiple processes concurrently?

You can either use objects or arrays with the yield keyword and then destructure the result.

With the Co library:

import co from 'co';
// with objects
co(function*() {
  const {user1, user2, user3} = yield {
    user1: user.findOneAsync({name: "Will"}),
    user2: user.findOneAsync({name: "Adam"}),
    user3: user.findOneAsync({name: "Ben"})
  };
}).catch(err => handleError(err))

// with arrays
co(function*() {
  const [user1, user2, user3] = yield [
    user.findOneAsync({name: "Will"}),
    user.findOneAsync({name: "Adam"}),
    user.findOneAsync({name: "Ben"})
  ];
}).catch(err => handleError(err))

With the Bluebird library:

// with the Bluebird library
import {props, all, coroutine} from 'bluebird';

// with objects
coroutine(function*() {
  const {user1, user2, user3} = yield props({
    user1: user.findOneAsync({name: "Will"}),
    user2: user.findOneAsync({name: "Adam"}),
    user3: user.findOneAsync({name: "Ben"})
  });
})().catch(err => handleError(err))

// with arrays
coroutine(function*() {
  const [user1, user2, user3] = yield all([
    user.findOneAsync({name: "Will"}),
    user.findOneAsync({name: "Adam"}),
    user.findOneAsync({name: "Ben"})
  ]);
})().catch(err => handleError(err))

Libraries that you can use today: Co and Bluebird.

Releasing Git Town

We are excited to release Git Town today! Git Town is an open source command-line tool that helps keep software development productive as project and team sizes scale. It provides a few additional high-level Git commands. Each command implements a typical step in common software development workflows.

Check out the screencast to get an overview of what these extra Git commands are and how to use them.

The problem: Software development doesn’t scale well

Many software teams run into a few typical scalability issues:

  • More developers means more frequent commits to the main branch. This makes feature branches go stale faster, which results in bigger, uglier merge conflicts when finally getting the LGTM and merging these feature branches into the main branch. Time spent resolving merge conflicts isn’t spent writing new features.
  • These merge conflicts increase the likelihood of breaking the main development branch. A broken main branch affects the productivity of the whole team.
  • Old branches are often not deleted and accumulate, making it harder to maintain an overview of where current development is happening, and where unmerged changes are.

These are only a few of the many headaches that diminish developer productivity and happiness as development teams grow. These issues are almost completely independent of the particular workflows and how/where the code is hosted.

The approach: Developer discipline

Fortunately, most of these issues can be addressed with more discipline:

  • Always update your local Git repo before cutting a new feature branch.
  • Synchronize all your feature branches several times per day with ongoing development from the rest of the team. This keeps merge conflicts small and easily resolvable.
  • Before merging a finished feature branch into the main development branch, update the main development branch and merge it into your feature branch. Doing so allows you to resolve any merge conflicts on the feature branch, and to test things before merging into the main branch. This keeps the main branch green.
  • Always remove all Git branches once you are done with them, from both your local machine as well as the shared team repository.

The solution: A tool that goes the extra mile for you

It is difficult to follow these practices consistently, because Git is an intentionally generic and low-level tool, designed to support many different ways of using it equally well. Git is a really wonderful foundation for robust and flexible source code management, but it does not provide direct high-level support for collaborative software development workflows. Using this low-level tool for high-level development workflows will therefore always be cumbersome and inefficient. For example:

  • creating a new, up-to-date feature branch in the middle of active development requires up to 6 individual Git commands.
  • updating a feature branch with the latest changes on the main branch requires up to 10 Git commands, even if there are no merge conflicts
  • properly merging a finished feature branch into the main development branch after getting the LGTM in the middle of working on something else requires up to 15 Git commands.

Keeping feature branches small and focused means more feature branches. Running all these commands on each feature branch every day easily leads to each developer having to run a ceremony of hundreds of Git commands each day!

While there are a number of tools like Git Flow that focus on supporting a particular Git branching model, there is currently no natural extension of the Git philosophy towards generic, robust, high-level teamwork support.

We are excited to release exactly that today: Git Town! It provides a number of additional high-level Git commands. Each command implements a typical step in most common team-based software development workflows (creating, syncing, and shipping branches). Designed to be a natural extension to Git, Git Town feels as generic and powerful as Git, and supports many different ways of using it equally well. The screencast gives an overview of the different commands, and our tutorial a broader usage scenario.

The awesomeness: Made for everybody, as Open Source

Git Town is for beginners and experts alike. If you are new to Git, and just want it to stay out of your way and manage your code, let Git Town provide the Git expertise and do the legwork for you. If, on the other hand, you are a Git ninja, and want to use it in the most effective manner possible, let Git Town automate the repetitive parts of what you would type over and over, with no impact on your conventions, workflow, and ability to do things manually.

Git Town is open source, runs everywhere Git runs (it’s written in Bash), is configurable, robust, well documented, well tested, has proven itself on everything from small open source projects to large enterprise code bases here at Originate, and has an active and friendly developer community.

Please try it out, check out the screencast or the tutorial, let us know how we can improve it, tell your friends and coworkers about it, or help us build it by sending a pull request!

FAQ

Does this force me into any conventions for my branches or commits?
Not at all. Git Town doesn’t require or enforce any particular naming convention or branch setup, and works with a wide variety of Git branching models and workflows.

Which Git branching models are supported by Git Town?
Git Town is so generic that it supports all the branching models that we are aware of, for example GitHub Flow, GitLab Flow, Git Flow, and even committing straight into the master branch.

How is this different from Git Flow?
Git Flow is a Git extension that provides specific and opinionated support for the powerful Git branching model with the same name. It doesn’t care too much about how you keep your work in sync with the rest of the team.

Git Town doesn’t care much about which branching model you use. It makes you and your team more productive by keeping things in sync, and it keeps your Git repo clean.

It is possible (and encouraged) to use the two tools together.

Is it compatible with my other Git tools?
Yes, we try to be good citizens. If you run into any issues with your setup, please let us know!

Managing Data Classes With Ids

At Originate, we have worked on a number of medium- to large-scale Scala projects. One problem we continually find ourselves tackling is how to represent the data in our system in a way that is compatible with the idea that sometimes that data comes from the client, and sometimes it comes from the database. Essentially, the problem boils down to: how do you store a model's id alongside its data in a type-safe and meaningful way?

The Problem

Consider your user model

case class User(email: String, password: String)

When a new user signs up, they send a User to your system, which you then store in the database. At this point, the user now has an Id. Should the id be stored inside the User model? If so, it would have to be an Option[Id]. Perhaps, instead of storing optional ids in your user model, you prefer to just have two data classes: one case class to represent data received from the client, and one to represent data received from the database.

// option 1: duplicate data classes
case class User(id: Id, email: String, password: String)
case class UserData(email: String, password: String)

// option 2: optional ids
case class User(id: Option[Id], email: String, password: String)

Both options have their pros and cons. There is a third option, which is the purpose of this blog post, but let's cover our bases first.

Duplicate Data Classes

One simple way to solve this problem is to consider that your system has two versions of your data: the version it receives from the client, and the version it receives from the database, an idea generalized by the CQRS design pattern.

Unfortunately, this adds a lot of overhead/boilerplate. Not only do you have to double the number of models in your system, you also have to make sure to reference the right version of each model in the right places. This can lead to a lot of confusion, not to mention the fact that with User and UserData, it's not immediately clear what the difference is to someone new on the project.

The biggest issue here is that we lose the correlation between the two types of data. User does not know that it comes from UserData and vice versa. Even worse, if we have methods on that data, for example, something that gives us a “formatted name” for a user… either we need both User and UserData to inherit from the same trait, or we duplicate the method. Unfortunately, this pattern is clunky and annoying.
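
For illustration, here is a rough sketch of the duplicated version with a shared trait; UserLike, formattedName, and the Id case class are hypothetical names used only for this sketch:

case class Id(value: Long) // stand-in for whatever id type your project uses

// shared trait so User and UserData don't have to duplicate logic
trait UserLike {
  def email: String
  def password: String
  def formattedName: String = email.takeWhile(_ != '@')
}

case class User(id: Id, email: String, password: String) extends UserLike
case class UserData(email: String, password: String) extends UserLike

Even with the shared trait, every new field still has to be added in two places, which is exactly the kind of boilerplate we would like to avoid.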

Optional Ids

We’ve tried this pattern on several large projects. On the one hand, it prevents us from having to duplicate all of our data classes, which is nice. The big problem with optional ids is that, most of the time, your data actually does have an id in it. Wrapping it with Option means that the rest of your system has to always consider that the value might not be there, even when it definitely should. This ends up producing a lot of code like this:

for {
  user <- UserService.first()
  userId <- user.id
} yield ...

It’s not the worst thing in the world, but when 80% of your code is dealing with users that have ids, it feels unnecessary. Also note that there is a hidden failure here. If your UserService.first() returns Some(user) but for whatever reason that user doesn’t have an Id, then it will look the same as if UserService.first() returned None. Programming around this is possible, but gets ugly if you use this pattern all over your codebase.

Additionally, lazy developers will say, "I know that the id should be there, why can't I just use user.id.get?" Option#get is a slippery slope: if you make exceptions for it in your codebase, people will abuse it, and then you lose the safety of having Options in the first place. At that point you might as well not even use Option, because you are getting the Option version of an NPE and also dealing with the overhead of Option. If you have developers trying to sneak in .get, consider checking out Brian McKenna's library, WartRemover.

Furthermore, the Optional Id pattern leads you to create a base class that all of your data classes inherit from.

trait BaseModel[T <: BaseModel[T]] { self: T =>
  val id: Option[Id]
}

This is so that you can create base service and data access traits that your layers inherit from. It's worth noting that in this situation, you will likely want a method def withId(id: Id): T defined on BaseModel so that your base services/DAOs know how to promote a data class without an id (received from the client) to a data class that has an id (after being persisted). You'll see in the next section that we can do away with all of this.
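
A minimal sketch of what this could look like; the Id case class and the concrete User implementation are illustrative assumptions, not the exact code we use:

case class Id(value: Long) // stand-in for whatever id type your project uses

trait BaseModel[T <: BaseModel[T]] { self: T =>
  val id: Option[Id]
  // promotes client data (no id yet) to persisted data (with an id)
  def withId(id: Id): T
}

case class User(id: Option[Id], email: String, password: String) extends BaseModel[User] {
  def withId(id: Id): User = copy(id = Some(id))
}

A base DAO can then call withId with the generated id after an insert, handing back the persisted version of the model.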

While this pattern works, and we have used it in production with success, the issue we run into is that the types and data don’t accurately reflect the concepts in our system. We want a way to say that ids are not optional when the data has been retrieved from the database, while also maintaining the relationship between data both in and outside of the database.

The Solution

After writing out the problem, and the other possible solutions, it starts to become clear that there is a better way. We know what we want:

  • There should be one class that represents a specific type of data, whether it’s from the client or the database.
  • We don’t want id’s to be optional. Data received from the database should represent that the id exists.
  • We don’t want values to ever be null.
  • Ideally we can minimize overhead (control flow, inheritance, typing overhead).

We introduce a class that contains an id and model data:

case class WithId[A](id: Id, model: A)

// receive data for a new user from the client
val user: User = Json.parse[User](json)

// receive data from the database
val user: WithId[User] = UserService.findByIdOrFail(userId)

This makes a lot of sense. We retain the fact that the only difference between client data and database data is that it has an id. We also avoid a good amount of overhead from the other two options (duplication, inheritance, and unnecessary complexity around ids). The main bit of overhead this introduces is extra typing when dealing with models in your service and DAO layer. While the types can get pretty nasty in some cases (our services are always asynchronous, so you may have Future[Seq[WithId[User]]]), it beats the alternatives.
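
To make those shapes concrete, here is a hedged sketch of a service interface built around WithId. UserService and its method names are hypothetical, and the definitions from above are repeated so the snippet stands alone:

import scala.concurrent.Future

case class Id(value: Long) // stand-in for whatever id type your project uses
case class User(email: String, password: String)
case class WithId[A](id: Id, model: A)

trait UserService {
  // client data goes in without an id; persisted data comes back wrapped in WithId
  def create(data: User): Future[WithId[User]]
  def findById(id: Id): Future[Option[WithId[User]]]
  def findAll(): Future[Seq[WithId[User]]]
}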

Removing .model Calls

If the thought of having to do user.model.firstName feels ugly, there is a way around it using implicits:

object WithId {
  implicit def toModel[A](modelWithId: WithId[A]): A = modelWithId.model
}

Note that we have not tested this solution out on a large-scale project, and it could add compile-time overhead.
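
With that conversion in scope (the WithId companion object is part of the implicit scope for WithId values), usage could look roughly like this – the variable names are made up:

// data from the database comes back wrapped in WithId
val persisted: WithId[User] = UserService.findByIdOrFail(userId)

// the implicit conversion unwraps to User, so no .model call is needed
val email: String = persisted.email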

Conclusion

Hopefully it is clear that while this is seemingly a small problem, finding the right way to model it in your system can have major implications for the cleanliness of your codebase. We have been trying the WithId pattern on a sizeable project for the last month with great results. No issues so far, and the type overhead isn't that bad considering the additional safety it brings.

Android and CI and Gradle - a How-To

Updated 10/31/15: Since writing this, things have changed. As a result, some of the information below may be outdated and no longer applicable. For more up to date information on how to use CircleCI for your Android projects, please see the Continuous Integration section of our Android Stack Guidelines.

There are tech stacks in this world that make it dead simple to integrate a CI build system.
The Android platform is not one of them.

Although Gradle is getting better, it’s still a bit non-deterministic, and some of the fixes you’ll need will start to feel more like black magic than any sort of programming.

But fear not! It can be done!

Before we embark on our journey, you’ll need a few things to run locally:

  1. A (working) Gradle build
  2. Automated tests (JUnit, Espresso, etc.)

If you don’t have Gradle set up for your build system, it is highly recommend that you move your projects over. Android Studio has a built-in migration tool, and the Android Dev Tools website has an excellent guide on how to migrate over to the Gradle build system, whether you’re on Maven, Ant, or some unholy combination of all three.

A very general example of a build.gradle file follows:

//build.gradle in /app
apply plugin: 'com.android.application'

buildscript {
  repositories {
    mavenCentral()
  }
  dependencies {
    classpath 'com.android.tools.build:gradle:1.0.1'
  }
}
//See note below
task wrapper(type: Wrapper) {
    gradleVersion = '2.1'
}

android {
  compileSdkVersion 19
  buildToolsVersion "21.1.2"

  defaultConfig {
    applicationId "com.example.originate"
    minSdkVersion 14
    targetSdkVersion 19
    versionCode 1
    versionName "1.0"

    testApplicationId "com.example.originate.tests"
    testInstrumentationRunner "android.test.InstrumentationTestRunner"
  }

  buildTypes {
    debug {
        debuggable true
    }
    release {
        debuggable false
    }
  }
}

dependencies {
  compile project(':libProject')
  compile 'com.android.support:support-v4:21.0.+'
}

(NOTE: the Gradle Wrapper task isn't strictly necessary, but it is a highly recommended way of ensuring you always know which version of Gradle you're using – both for future-proofing and for catching regressions)

Check out the Android Developers website for some good explanations and samples.

Choose your weapon

At Originate, we are big fans of CircleCI. They sport a clean, easy-to-use interface and support more languages than you could possibly care about. Plus, they are free for open source GitHub projects!
(Other options include TravisCI, Jenkins, and Bamboo)

In this guide, we’ll be using CircleCI, but these instructions should translate readily to TravisCI as well.

Configure all the things!

In order to use CircleCI to build/test your Android library, there’s some configuration necessary. Below are some snippets of some of the basic configurations you might use. About half of this comes from the CircleCI docs and half of it comes from my blood, sweat, and tears.

At the end of this section, I'll include a complete circle.yml file. (The complete docs for the circle.yml file are here)

Machine

First, the code:

machine:
  environment:
    ANDROID_HOME: /home/ubuntu/android
  java:
    version: oraclejdk6
  1. The setting of the ANDROID_HOME environment variable is necessary for the Android SDKs to function properly. It’ll also be useful for booting up the emulator in later steps.
  2. Although setting the JDK version isn’t strictly necessary, it’s nice to ensure that it doesn’t change behind-the-scenes and possibly surprise-bork your build.

Dependencies + Caching

dependencies:
  cache_directories:
    - ~/.android
    - ~/android
  override:
    - (source scripts/environmentSetup.sh && getAndroidSDK)
  1. By default, CircleCI will cache nothing. You might think this a non-issue right now, but you'll reconsider when each build takes 10+ minutes to inform you that you dropped a semicolon in your log statement. By caching `~/.android` and `~/android`, you can shave precious minutes off of your build time.
  2. Android provides us with a nifty command-line utility called… android (inventive!). We can use this in a little Bash script that we'll write in just a second. For now, just know that scripts/environmentSetup.sh can be whatever you want, as can the Bash function getAndroidSDK.

Bash Scripts – a Jaunt into the CLI

Gradle is good at a lot of things, but it isn’t yet a complete build system. Sometimes, you just need some good ol’fashioned bash scripting.

In this section, we’ll download Android API 19 (Android 4.4 Jelly Bean) and create a hardware-accelerated Android AVD (Android Virtual Device – aka “emulator”) image.

Note: If android commands confuse/scare you, check out the d.android documentation.

#!/bin/bash

# Fix the CircleCI path
function getAndroidSDK(){
  export PATH="$ANDROID_HOME/platform-tools:$ANDROID_HOME/tools:$PATH"

  DEPS="$ANDROID_HOME/installed-dependencies"

  if [ ! -e $DEPS ]; then
    cp -r /usr/local/android-sdk-linux $ANDROID_HOME &&
    echo y | android update sdk -u -a -t android-19 &&
    echo y | android update sdk -u -a -t platform-tools &&
    echo y | android update sdk -u -a -t build-tools-21.1.2 &&
    echo y | android update sdk -u -a -t sys-img-x86-android-19 &&
    #echo y | android update sdk -u -a -t addon-google_apis-google-19 &&
    echo no | android create avd -n testAVD -f -t android-19 --abi default/x86 &&
    touch $DEPS
  fi
}
  1. The export PATH line is to ensure we have access to all of the Android CLI tools we’ll need later in the script.
  2. The DEPS=... is used in the if/then block to determine if CircleCI has already provided us with cached dependencies. If so, there’s no need to download anything!
  3. Note that we’re explicitly requesting the x86 version of the Android 19 emulator image (sys-img-x86-android-19). The ARM-based emulator is notoriously slow, and we should use the hardware-accelerated version if at all possible.
  4. We create the Android Virtual Device (AVD) with the line android create avd ..., with a target of Android 19 and a name of testAVD.
  5. If you need the Google APIs (e.g., Maps, Play Store, etc.), you can uncomment the line addon-google_apis-google-19.
  6. Even though Google has released an API 21 HAXM emulator, I still recommend using an API 19 AVD. API 21’s emulator doesn’t always play nice with CircleCI.

CAVEAT – Because of the way this caching works, if you ever change which version of Android you compile/run against, you need to click the “Rebuild & Clear Cache” button in CircleCI (or use the CircleCI API). If you don’t, you’ll never actually start compiling against the new SDK. You have been warned.

You shall not pass! (until your tests have run)

This section will vary greatly depending on your testing setup, so YMMV – moreso than with the rest of this post.
This section assumes you're using a plain vanilla Android JUnit test suite.

test:
  pre:
    - $ANDROID_HOME/tools/emulator -avd testAVD -no-skin -no-audio -no-window:
      background: true
    - (./gradlew assembleDebug):
      timeout: 1200
    - (./gradlew assembleDebugTest):
      timeout: 1200
    - (source scripts/environmentSetup.sh && waitForAVD)
  override:
    - (./gradlew connectedAndroidTest)
  1. The $ANDROID_HOME/tools/emulator starts a “headless” emulator – more specifically, the one we just created.
    1a. Running the emulator from the terminal is a blocking command. That’s why we are setting the background: true attribute on the emulator command. Without this, we would have to wait anywhere between 2-7 minutes for the emulator to start and THEN build the APK, etc. This way, we kick off the emulator and can get back to building.
  2. The two subsequent ./gradlew commands use the Gradle wrapper to build the code from your /app and androidTest directories, respectively.
  3. See below for environmentSetup.sh Part II. Essentially, after building both the app and the test suite, we cannot continue without the emulator being ready. And so we wait.
  4. Once the emulator is up and running, we run gradlew connectedAndroidTest, which, as its name suggests, runs the tests on the connected Android device. If you’re using Espresso or other test libraries, those commands would go here.
    4a. The CircleCI Android docs say that the “standard” way to run your tests is through ADB – ignore them. Gradle is the future and it elides all of those thorny problems that ADB tests have.

Bash Round 2

As mentioned above, after Gradle has finished building your app and test suite, you’ll kind of need the emulator to…y’know…run your tests.

This script relies on the currently-booting AVD’s init.svc.bootanim property, which essentially tells us whether the boot animation has finished. Sometimes, it seems like it’ll go on forever…


(Image: the Android AVD boot animation – will the madness never stop?!)

This snippet can go in the same file as your previous bash script – in that case, you only need one #!/bin/bash at the top of the file.

#!/bin/bash

function waitForAVD {
    (
    local bootanim=""
    export PATH=$(dirname $(dirname $(which android)))/platform-tools:$PATH
    until [[ "$bootanim" =~ "stopped" ]]; do
      sleep 5
      bootanim=$(adb -e shell getprop init.svc.bootanim 2>&1)
      echo "emulator status=$bootanim"
    done
    )
}

Note: This script was adapted from this busy-wait script.

Results

By default, CircleCI will be fairly vague regarding your tests' successes and/or failures. You'll have to go hunting through the very chatty, verbose Gradle logs in order to determine exactly which tests failed. Fortunately, there's a better way – thanks to Gradle!

When you run gradlew connectedAndroidTest, Gradle will create a folder called /build/outputs/reports/**testFolderName**/connected in whichever folder you have a build.gradle script in.

So, for example, if your repo is at ~/username/awesome_repo, with a local library in awesome_repo/lib and an app in awesome_repo/app, the Gradle test artifacts should be in awesome_repo/app/build/outputs/reports/**testFolderName**/connected.

In this directory, you’ll find a little website that Gradle has generated, showing you which test packages and specific tests passed/failed. If you like, you can tell CircleCI to grab this by placing the following at the top of your circle.yml file:

general:
  artifacts:
    - /home/ubuntu/**repo_name**/build/outputs/reports/**testFolderName**/connected

You can then peruse your overwhelming success under the Artifacts tab for your CircleCI build – just click on index.html. It should pull up something like this:


(Screenshot: example Gradle test report artifact)

Security, Signing, and Keystores

The astute among you will notice that I haven’t gone much into the process of signing an Android app. This is mainly for the reason that people trying to set up APK signing fall into 2 categories – Enterprise and Simple.

Enterprise: If you’re programming Android for a company, you probably have some protocol regarding where your keystores/passwords can and cannot live – so a general guide such as this won’t be much help for you. Sorry.

Simple: You’re not Enterprise, so your security protocol is probably a little more flexible – i.e., you feel moderately comfortable with checking your keystore files into your respository.

In either case, Google and StackOverflow are your friends.

My final word of advice is that CircleCI can encrypt things like keystore passphrases – stuff you might consider passing in plain-text in your buildscript files. Check out CircleCI’s Environment Variables doc.

Finally

Go into your CircleCI settings, add a hook for your GitHub repo, and then do a git push origin branchName. If the Gradle Gods have smiled upon you, Circle should detect your config files and start building and testing!

Depending on your test suite, tests can take as little as a few minutes or as much as a half-hour to run. Try not to slack off in the meanwhile, but rejoice in having some solid continuous integration!

Stay tuned for a future blog post about using CircleCI to automagically deploy to MavenCentral!

Flipping to the back of the book…

Below is the full circle.yml as well as environmentSetup.sh for your viewing/copying pleasure:

# Build configuration file for Circle CI
# needs to be named `circle.yml` and should be in the top level dir of the repo

general:
  artifacts:
    - /home/ubuntu/**repo_name**/build/outputs/reports/**testFolderName**/connected

machine:
  environment:
    ANDROID_HOME: /home/ubuntu/android
  java:
    version: oraclejdk6

dependencies:
  cache_directories:
    - ~/.android
    - ~/android
  override:
    - (echo "Downloading Android SDK v19 now!")
    - (source scripts/environmentSetup.sh && getAndroidSDK)

test:
  pre:
    - $ANDROID_HOME/tools/emulator -avd testAVD -no-skin -no-audio -no-window:
      background: true
    - (./gradlew assembleDebug):
      timeout: 1200
    - (./gradlew assembleDebugTest):
      timeout: 1200
    - (source scripts/environmentSetup.sh && waitForAVD)
  override:
    - (echo "Running JUnit tests!")
    - (./gradlew connectedAndroidTest)

And the accompanying shell scripts:

#!/bin/bash

# Fix the CircleCI path
function getAndroidSDK(){
  export PATH="$ANDROID_HOME/platform-tools:$ANDROID_HOME/tools:$PATH"

  DEPS="$ANDROID_HOME/installed-dependencies"

  if [ ! -e $DEPS ]; then
    cp -r /usr/local/android-sdk-linux $ANDROID_HOME &&
    echo y | android update sdk -u -a -t android-19 &&
    echo y | android update sdk -u -a -t platform-tools &&
    echo y | android update sdk -u -a -t build-tools-21.1.2 &&
    echo y | android update sdk -u -a -t sys-img-x86-android-19 &&
    #echo y | android update sdk -u -a -t addon-google_apis-google-19 &&
    echo no | android create avd -n testAVD -f -t android-19 --abi default/x86 &&
    touch $DEPS
  fi
}

function waitForAVD {
    (
    local bootanim=""
    export PATH=$(dirname $(dirname $(which android)))/platform-tools:$PATH
    until [[ "$bootanim" =~ "stopped" ]]; do
      sleep 5
      bootanim=$(adb -e shell getprop init.svc.bootanim 2>&1)
      echo "emulator status=$bootanim"
    done
    )
}

Refactoring Git Branches - Part II

This is a follow-up to Refactoring Git Branches – part I.

While coding on a feature, we almost always stumble across areas of the code base that should be improved along the way.

If such an improvement is a larger change, I usually make a note to do it separately, right after the feature I am currently working on. This helps me stay efficient by focusing on one task at a time, and prevents me from getting lost in several changes happening at the same time.

Sometimes, however, it is just easier and faster to implement a small improvement like a syntax fix along the way, and so my feature branches can still end up implementing several independent changes. Reviewing such branches with multiple changes is harder than necessary, since it is much easier to talk about different topics separately than all at once.

Before submitting pull requests for such bloated feature branches, I usually do a git reset master (assuming master is the branch from which I cut my feature branch). This undoes all the git commit commands that I have done in my branch, and I am left with all changes from my feature branch uncommitted and unstaged.

Now it is easy to stage and commit only the lines that are actually related to my feature. I can do this using the git command line directly (through interactive staging of hunks), or through GUI tools like GitX or Tower. Then I commit unrelated changes into their own feature branches, even if they are small.

If my main feature depends on the peripheral changes made, I would first commit the peripheral changes into the current branch, then cut a new branch from the current branch and commit the actual feature into it. This way, the feature can use the peripheral changes, but both can be reviewed separately. If we have a remote repo set up, since we are changing history here, we need to do a git push -f on our old branch to overwrite the old commits on the remote repo.

This results in pull requests that are super focused on exactly one thing and nothing else, and are thereby much easier and faster to review.

This technique obviously only works for private feature branches. Never change Git history that other people use!

iOS Checklists - Creating and Submitting Your App

Over the last seven years, our team at Originate has created countless iOS apps. Along the way we have continuously fine-tuned our iOS development process, creating best practices that we apply to every new app we build.

We’ve put some of this down on paper in the form of two checklists, one used when starting an iOS project, and one used when preparing to submit to the App Store.

Following these checklists has helped our team work together more efficiently, architect better solutions, reduce development time, and reduce the risks that come with publishing an app to the App Store.

We hope these checklists will do the same for you.

Starting an iOS project

  • Repo/GitHub

    1. Start by creating a new repo in GitHub and adding the appropriate iOS gitignore file.

    2. Be sure to follow the git-flow workflow, with master holding your App Store builds, dev holding your latest development code, and feature branches holding the current work in progress.

  • Xcode

    1. Make sure everyone in the team is on the same version of Xcode.

    2. Turn on "Analyze during build" and "Treat Warnings as Errors" (set these under "Build Settings" in your application target).

    3. Turn off tabs and use spaces: Xcode > Preferences > Text Editing > Indentation > Prefer Indent using Spaces, Tab width 2, Indent width 2.

  • Jenkins/OS X Server/TestFlight

    Set up CI/CD (Continuous Integration/Continuous Deployment), making sure that with every push to dev, all tests are run and an Ad Hoc build is created and emailed to the team (with the commit logs in the email body). If the build or tests fail, a failure email should instead be sent out to the team. At Originate, most of our iOS projects use OS X Server integrated with TestFlight.

  • Coding Style/Standards

    1. Follow Apple’s recommended iOS coding style guide.

    2. In addition, follow recommendations here: http://qualitycoding.org/preprocessor/

    3. Keep your .h files short and simple, exposing only what is needed by other classes. Move all other methods, properties, instance variables declarations, etc. inside the .m file.

    4. Name your view controllers based on the page they are displaying (e.g. LoginViewController).

    5. Organize your project navigator and files using groups. Good group names are Data Models, Views, Controllers, App Delegate, Supporting Files, Tools, etc. Messy project navigators should not be accepted.

    6. Before submitting a pull request, go over the checklist at the bottom of our effective code review blog post and look for red flags.

  • Architecture

    1. MVC (Model View Controller) is sometimes jokingly called Massive View Controller in iOS development. These massive controllers that do everything are a common beginner mistake. They are not acceptable. As needed, split out table related delegates/data sources from the view controller into their own separate classes. Split out views (especially if they are reused) into their own view classes. Make sure helper methods are pushed out into helper classes. In addition, the View Controller should not make any calls to the server; instead, the Model or a manager class should handle this.

    2. Some good sample code/tutorials/patterns: Lighter View Controllers, Viper and BrowseOverflow (from the iOS TDD book).

  • Views/Nibs/Storyboards

    1. Be sure to use constraints/autolayout for your views and support different screen sizes. Otherwise you will have to manually configure the frame sizes/positions for each view to make sure it fits correctly for each screen size. PureLayout and FLKAutoLayout have been used on some projects at Originate.

    2. Determine if Nib files will be used. Originate recommends avoiding them, but leaves the decision to the Tech Lead. Storyboards are not used as they are hard to manage with multiple developers (e.g. trying to merge a storyboard), slow down Xcode, and add a level of complexity to the code that is not necessary.

    3. Use FrameAccessor to modify frames if needed. This will make it very easy to get and set a UIView’s size and origin.

  • Fonts and Colors

    Standardize fonts and colors throughout the app (i.e. create helper classes) so that it's easy to maintain/modify them as needed. This also makes for cleaner code that doesn't have random RGB or font strings scattered in the views.

  • Display text

    1. All strings displayed to the user must be placed in a localization file.

    2. Avoid using image assets that contain text, use a UILabel instead, pulling the text out of the localization file.

  • Analytics

    http://replay.io is our go-to platform for analytics at Originate.

  • Crash reporting

    Although Apple and iTunes Connect are making progress here, it’s still best to use a third-party tool like Crashlytics. They work much faster and have a better UI/reporting interface.

  • Add AOP support if needed (e.g. for logging).

  • Third party code dependencies

    Cocoapods can be used to maintain third-party dependencies. At Originate this decision is left to the Tech Lead.

  • Server communication

    1. Add the ability to toggle between server environments (e.g. QA, dev, staging, etc.) in app settings.

    2. If needed, implement an app update notification system, where the server can inform the app a newer version is available and the app can display the appropriate message to the user.

    3. Make sure an activity indicator is displayed during server calls. MRProgress has been used on some projects at Originate.

    4. AFNetworking or RestKit (if Core Data is needed) should be used for network communication. See our AFNetworking cache blog post for optimizations/configuration.

    5. For debugging, make sure all server communications (i.e. web service requests/responses) are printed out to the console.

    NOTE: Use DLog to log all requests and responses (in debug mode) to the console:

  #ifdef DEBUG
  #define DLog(fmt, ...) NSLog((@"%s [Line %d] " fmt), __PRETTY_FUNCTION__, __LINE__, ##__VA_ARGS__);
  #else
  #define DLog(...);
  #endif
  
  • Polish

    Small tweaks to an app can go a long way. Make sure to include the simple features users expect, like pull-to-refresh, tapping the status bar to scroll to the top, an activity indicator during server calls, etc., to make the app user friendly. Your Design and Product team should be on top of this already. The iOS Human Interface Guidelines is a good reference doc for them.

Lastly, ensure that the process to create the needed accounts (e.g. iTunes Connect, Urban Airship, Analytics Accounts, etc.) is started.

Submitting to the App Store

Releasing to the App Store must not be taken lightly. Steps must be taken to ensure the submitted app follows Apple’s guidelines and will work once published to the store. Four to eight hours should be allocated to perform this task.

When your app has been tested and is ready to release, follow these steps:

  • Apple Guidelines

    1. At a high level, Apple goes over everything related to App testing, Ad Hoc distribution, and App Store release in the App Distribution Guide. This document should be reviewed.

    2. And of course, most importantly, these documents should be memorized! App Store Review Guidelines and Common App Rejections.

    NOTE: Although not listed in the rejections link above, we’ve found that often developers forget to add content flagging for apps that create user generated data. Apple rejects these apps until content flagging is added.

  • Core Data

    If you are submitting an updated app that uses Core Data, then you must make sure to write a migration script in order to properly propagate changes to the DB. If this is not done, then it’s possible that the app can crash once updated and the user will have to delete and reinstall the app. See Core Data Model Versioning and Data Migration.

  • App Review Screen

    If you want to prompt users to review your app, UAAppReviewManager makes sure the review prompt appears at the right time.

  • Release Scheme

    In the “Edit Scheme” section of Xcode, “Archive” should be set to “Release”. Release builds must hide log statements and turn off any test code/test screens used for development.

    NOTE: Due to compiler optimizations, release builds can sometimes function differently than debug builds. It is best to start testing release builds a few days before the release to the App Store in order to capture any possible issues.

  • Server

    Point your release build at the proper production servers (and use HTTPS).

  • Release Candidate Ad Hoc builds / App Store build

    Make sure to use the correct Bundle Identifier, Apple ID, certificates (e.g. Push Certificates), etc. for the release candidate and App Store builds.

  • Test Release Candidate configuration

    1. Make sure the app is communicating with the correct production servers (using HTTPS).

    2. Make sure all test screens are hidden.

    3. Make sure no sensitive data is printed to the console.

    4. Make sure Analytics is working with the correct production account.

    5. Make sure services such as Urban Airship are working with the correct production account.

  • Push to master

    1. Update the bundle version number in the app plist file. This number does not have to match the version number you assign to the app on the App Store, but it’s best that it does.

    2. Push the code that is ready for release to the master branch and tag it.

  • Setup the app in iTunes Connect

    1. In iTunes Connect create a new App, or if this is an update, select “Add Version”.

    2. Make sure to fill in all data and upload all necessary images (and video).

    NOTE: Only the fields “Description” and “What’s New In This Version” can be modified once the app is accepted and placed in the store. Other important fields like images cannot be modified.

  • Submit

    Use Xcode to create the App Store build and upload to iTunes Connect.