CI improvements
- See if we can get a clear report of the number of failures
- Try to make pytest colors pass through tox and GitLab (for easier reading of results)
- Check that cancelled jobs are actually killed (they appear to keep running in the background and prevent other jobs from starting)
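
For the failure report, one possible starting point is to have pytest write a JUnit XML file (`--junitxml=<path>`) and summarize it in a small script. The sketch below is only an illustration: the function names `summarize_junit` and `failed_test_names` are hypothetical, and it assumes the report has the usual pytest layout of flat (non-nested) `<testsuite>` elements carrying `tests`, `failures`, `errors` and `skipped` attributes.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Sum the test counters over every <testsuite> element in a JUnit XML report."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also yields the root element itself, so this handles both a bare
    # <testsuite> root and a <testsuites> wrapper around it.
    for suite in root.iter("testsuite"):
        for key in totals:
            # Missing attributes are treated as zero (assumption).
            totals[key] += int(suite.get(key, 0))
    return totals


def failed_test_names(xml_text):
    """Return 'classname::name' for each test case with a <failure> or <error> child."""
    root = ET.fromstring(xml_text)
    names = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            names.append(f"{case.get('classname')}::{case.get('name')}")
    return names
```

The totals and the list of failing test names could be printed at the end of the job so the count is visible without scrolling through the full log. The same XML file can also be declared as a JUnit report artifact in GitLab CI, which may already give a usable failure summary.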