It has been a while since my last post here. Lots of things changed in 2011. The big change happened in March, when I joined ThoughtWorks Brazil and moved to Porto Alegre. From March to November (8 months) I worked on a team distributed between Brazil and the US. During the project, we as a team had the opportunity to learn lots of good practices together, and this post aims to share them with the community. All the topics below came up in the team retrospective at the end of the project and correspond to the lessons the team learned during that time.
P.S.: Because I didn’t write down the whole discussion at the time of the retrospective, some arguments may not represent the team’s opinion, but my personal one.
1. Visit your client
Because our team was distributed, people were frequently flying back and forth between our office and the client. Some people may think this strategy is too expensive to afford (flights, hotels, meals), and it is, in fact. However, the success of a project is tightly related to the relationships between team members. In co-located teams it is easier to build trust because people see each other all the time, go to lunch together, and hang out on weekends. That is not true in a distributed team. It’s crucial that team members have face-to-face meetings once in a while, preferably at the beginning of the project. Believe me, this approach will actually save money in the long run. So every distributed project must have a budget set aside for team members to travel.
2. Remote pairing is hard
ThoughtWorkers love pairing. I would say that our team in Brazil paired on 90% of the tasks. We pair to share knowledge between team members, ramp up new team members, and write better code. We had a strategy of dedicating two Brazilian developers to pair with two devs in the US through Skype and LogMeIn. The idea behind that was to balance expertise between both locations. For example, there were specific parts of the application that only one side of the team knew about, and we wanted to eliminate that risk. Unfortunately, we found out that remote pairing is hard. The main reason? It slows development down. In our experience, the challenges of remote pairing were: (1) the connection dropped frequently, breaking the development flow; (2) typing in a remote box is slow, which resulted in devs having to express their solutions through conversation instead of code; (3) explaining how to code a solution is much harder than implementing it yourself. Due to all these challenges, we decided to do remote pairing only in very specific scenarios.
3. Use real data/users before going live
During those eight months we had four main releases. The first one was a success, which translates to happy users, client and team. Unfortunately, over time users started complaining about application slowness, unpredictable results and data security faults. So what did we do wrong? Just after the release the system had almost no data, so pages loaded fast, bad data didn’t exist, and users hadn’t had time to break the system yet. After two months we had tons of data inserted, and as a consequence the application’s reports started to take a while to load. In addition, users posted all kinds of data and found new flows that had never been tested. The problem: we didn’t stress the application with real data and real users. To solve these problems we scheduled frequent user sessions so we could learn how people actually use the system, and we dumped production-like data into our staging environment. After that we could find problems before they appeared in production and solve them as part of our normal process flow.
4. Write acceptance tests
I used to write lots of unit tests before joining ThoughtWorks (I already had the TDD mentality). However, I had never written acceptance tests. So every release was painful: manual testing took too long, and the team didn’t know for sure what could break. In my first ThoughtWorks project, a user story was only Dev Done after all acceptance scenarios were implemented. Basically, before the story was Ready to Dev, a business analyst sat down with the PO and QAs to write scenarios in a Given/When/Then format. Those scenarios would drive our development until all of them were implemented. After that, we were sure the story was really done. In addition, those scenarios were added to our build pipeline, working as a safety net for developers and preventing us from running tons of manual tests when releasing. With good acceptance test coverage, we were able to release fast and with confidence.
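To give a feel for the Given/When/Then structure, here is a minimal sketch in Python. The domain (a hypothetical account with a withdrawal rule) is invented for illustration; it is not from our project, and a real acceptance test would drive the application through its UI or API rather than an in-memory object:

```python
class Account:
    """Tiny in-memory model, just enough to make the scenario runnable."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def test_withdrawal_reduces_balance():
    # Given an account with a balance of 100
    account = Account(balance=100)
    # When the user withdraws 30
    account.withdraw(30)
    # Then the remaining balance is 70
    assert account.balance == 70
```

The value is less in the code than in the conversation: the BA, PO and QAs agree on the Given/When/Then wording first, and the test simply encodes it.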
5. Keep your build green
In our team we had a continuous integration server responsible for checking out the source code, compiling the solution, running static code analysis, running automated tests and building deployment artifacts. All these tasks were triggered after every check-in. The continuous integration server helped us by integrating the devs’ work into the final solution and checking for possible errors. A clear example is when someone forgets to check in source code files or breaks existing automated tests. When that happens, the build becomes red, indicating the application is in a bad state. We as a team defined a rule that whenever the build was red, a pair would stop whatever they were doing to fix it. Meanwhile, no other dev should check in: our team was quite big, and if others kept checking code into the repository, fixing the build would become harder. This rule helped us keep our build green as much of the time as possible, which meant the application was usually in a good state, ready to be deployed to staging, for example.
6. Insert data for each test
Although acceptance tests are important when developing large software applications, they introduce a big problem: maintenance. Our team learned how expensive maintaining acceptance tests is due to their non-determinism. One of the reasons for that was a lack of isolation. Right at the beginning, the acceptance test suite was built to insert all necessary data before running all tests, and new test data was added to a SQL script file as needed. After a while, tests started to share data like users, user groups and other sensitive information. If one test modified that data, other tests that relied on it would fail unexpectedly. The solution we found was inserting data before each test. Instead of having a single script with all test data, we programmatically added only the data necessary for a single test to run. Although it increased the suite’s running time, this change brought more confidence in the build: when the build failed, the team knew that some feature was actually broken.
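The isolation idea can be sketched in a few lines. This example is hypothetical (our suite ran against a real database, not SQLite), but the principle is the same: each test builds a fresh database and inserts only the rows it needs, so no test can be broken by data another test left behind:

```python
import sqlite3


def fresh_db():
    """A brand-new in-memory database per test: nothing is shared,
    nothing leaks between tests."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (name TEXT, active INTEGER)")
    return db


def test_counts_only_active_users():
    db = fresh_db()
    # Insert exactly the rows this test needs, nothing more.
    db.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", 1), ("bob", 0)],
    )
    (count,) = db.execute(
        "SELECT COUNT(*) FROM users WHERE active = 1"
    ).fetchone()
    assert count == 1
    db.close()
```

The trade-off is exactly the one we hit: per-test setup is slower than one big shared script, but a failure now points at a real regression instead of at another test’s leftovers.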
7. Group communication channel
I have already said that we were a distributed team and that communication was a big challenge. One strategy that really helped our team was creating an open communication channel where all team members were connected all the time. In our case, we had a Skype group conversation for that, and we used it freely to share what was going on in the project. Instead of private, parallel conversations, we used the group chat to announce when a user story was ready for QA or blocked, when the build broke, when a meeting was cancelled, and even to discuss technical issues. That way, all team members were on the same page about project events.
I am pretty sure that we as a team learned a bunch of other lessons. However, these were the ones we highlighted in our retrospective. I hope this list will be useful to you someday. If you have any other topic to add to the list, please feel free to leave it as a comment.