One of the most intriguing stories published about culture’s influence on collaborative engineering projects can be found in “The Mythical Man-Month” by Frederick P. Brooks. His book is a classic within Computer Science and also goes by the moniker “TMMM”.
Numerous empirical studies on software engineering projects have referenced TMMM, and almost every research paper in the relatively new Empirical Software Engineering [ESE 2011] discipline within computer science starts by quoting it. Many of the key figures in the DevOps movement also point to this 40+ year old book. How come?
Published in 1975, TMMM still provides valuable insight into why certain software projects are destined to fail from the beginning and what we can do to avoid a letdown.
A prime example of the success that can be obtained by applying the knowledge in this book is Microsoft Windows Vista and Windows 7. Microsoft Research’s ESE group used Brooks’s wisdom to study what affects the quality of software produced by globally dispersed, culturally diverse teams. They not only studied the problem but also applied their findings, and the success of Windows 7 far outweighed the negativity surrounding Windows Vista.
Let’s take a look at TMMM Chapter 7, “Why Did the Tower of Babel Fail?”
According to the Genesis account, the tower of Babel was man’s second major engineering undertaking, after Noah’s ark. Babel was the first engineering fiasco. The story is deep and instructive on several levels. Let us, however, examine it purely as an engineering project, and see what management lessons can be learned.
How well was their project equipped with the prerequisites for success?
Did they have:
- A clear mission? Yes, although naively impossible. The project failed long before it ran into this fundamental limitation.
- Manpower? Plenty of it.
- Materials? Clay and asphalt are abundant in Mesopotamia.
- Enough time? Yes, there is no hint of any time constraint.
- Adequate technology? Yes, the pyramidal or conical structure is inherently stable and spreads the compressive load well.
Clearly masonry was well understood. The project failed before it hit technological limitations.
Well, if they had all of these things, why did the project fail? Where were they lacking?
In two respects: communication and, consequently, organization.
They were unable to talk to each other, which led to a failure in coordination and a subsequent break in workflow.
From these events, we gather that lack of communication led to disputes, bad feelings, and group jealousies. Shortly after, the clans began to move apart, preferring isolation to wrangling.
The Tower of Babel project failed due to lack of collaboration. Cultural differences between the teams working on the project led to a lack of communication and, consequently, a lack of the organization required to get the job done.
So how does this relate to large global corporations and their employees? How does it relate to different “corporate clans” like IT, Marketing, Product, Sales, etc., which may be spread across the globe and include many cultural backgrounds?
Conway’s Law states, in essence, that organizations are constrained to produce artifacts that mirror their own communication structures.
One cannot improve communication without imposing changes in organization. Simply put:
Communication and organization are two sides of the same coin.
Good collaboration tools can ease the situation, but cannot resolve the core problem: One cannot expect good communication within an inadequately structured organization.
A globally dispersed work force operating in hierarchical or matrix structures becomes hopelessly inefficient and ineffective over time. The US corporate “best practice” of reshuffling reporting lines every other year does not improve the situation.
For an excellent discussion of this topic, I recommend reading “Boiling Frogs”, a GCHQ research paper on software development and organisational change in the face of disruption [GCHQ 2016]. According to this paper, many large global enterprises are running “dysfunctional organizations.”
When Microsoft researchers found that physical distance doesn’t affect post-release fault rates, but distance in the organizational chart does [Nagappan et al (2008), Bird et al (2009)], Microsoft changed the structure of its development organization, which led to a product of superior quality, Windows 7.
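To make “distance in the organizational chart” a little more concrete, here is a minimal sketch of one way to measure it: count the reporting steps from two people up to their closest common manager. This is my own illustration, not the metric used in the cited Microsoft studies. The names, the `MANAGER_OF` mapping and the `org_distance` function are all hypothetical, and the sketch assumes the reporting lines form a clean tree without cycles.

```python
# Hypothetical reporting lines: each person maps to their direct manager.
# The person at the top (here "vp_eng") has no entry.
MANAGER_OF = {
    "alice": "dev_lead",
    "bob": "dev_lead",
    "carol": "qa_lead",
    "dev_lead": "vp_eng",
    "qa_lead": "vp_eng",
}


def chain_to_root(person, manager_of):
    """Return [person, manager, manager's manager, ...] up to the top."""
    chain = [person]
    while person in manager_of:
        person = manager_of[person]
        chain.append(person)
    return chain


def org_distance(a, b, manager_of):
    """Count reporting steps between a and b via their closest common manager."""
    # Map everyone above a (including a) to the number of steps up from a.
    steps_from_a = {p: i for i, p in enumerate(chain_to_root(a, manager_of))}
    # Walk up from b until we meet someone who is also above a.
    for steps_from_b, p in enumerate(chain_to_root(b, manager_of)):
        if p in steps_from_a:
            return steps_from_a[p] + steps_from_b
    raise ValueError(f"{a} and {b} share no common manager")


if __name__ == "__main__":
    print(org_distance("alice", "bob", MANAGER_OF))    # same team lead: 2
    print(org_distance("alice", "carol", MANAGER_OF))  # different clans: 4
```

In this toy chart, Alice and Bob sit two steps apart no matter which continents they work on, while Alice and Carol sit four steps apart even if they share a desk. A real study would need actual reporting data joined with defect data per component.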
What would be needed to conduct similar empirical studies at your own corporation to find the optimal organizational structure to drive down post-release fault rates in your software systems? And what do you do to ensure the clans in your globally dispersed work force keep talking to each other? (Remember: tools are not the point!) And are these observations really limited to collaborative engineering projects in the field of software engineering?
I firmly believe that the forces that brought down DaimlerChrysler, GlobalOne, AOL Time Warner, Sprint Nextel and other mergers of unequal corporate cultures follow the exact same pattern, or as Peter Drucker put it: “Culture eats strategy for breakfast.”
Admittedly, nobody gives a damn about what I believe (not even my kids), but the science behind Amazon’s two-pizza teams [PIZZA 2014] and Spotify’s evolving model for scaling agile across globally distributed software development teams [SPOTIFY 2012] appears to have prompted behemoths like IBM to have a crack at this as well. They started their journey into the space of Agile Enterprise [IBM 2016], and it follows the patterns mentioned above.
Can this ever happen in your company?
[ESE 2011] “Empirical Software Engineering” in: American Scientist, 2011. http://www.americanscientist.org/issues/feature/2011/6/empirical-software-engineering/1
[GCHQ 2016] GCHQ: Boiling Frogs? Technology organizations need to change radically to survive increasing technical and business disruption. https://github.com/gchq/BoilingFrogs
[Nagappan et al (2008)] The Influence of Organizational Structure On Software Quality: An Empirical Case Study, January 1, 2008. https://www.microsoft.com/en-us/research/publication/the-influence-of-organizational-structure-on-software-quality-an-empirical-case-study/
[Bird et al (2009)] Does distributed development affect software quality? An empirical case study of Windows Vista, August 1, 2009. https://www.microsoft.com/en-us/research/publication/does-distributed-development-affect-software-quality-an-empirical-case-study-of-windows-vista/
[SPOTIFY 2012] Henrik Kniberg & Anders Ivarsson: Scaling Agile @ Spotify with Tribes, Squads, Chapters & Guilds. https://dl.dropboxusercontent.com/u/1018963/Articles/SpotifyScaling.pdf
[IBM 2016] A Corporation as Big as a Small Country: Towards an Agile Enterprise. https://www.infoq.com/presentations/ibm-agile