Financial industry response to new regulations is always interesting and the Dodd-Frank Act is no exception. Peel back the brouhaha, however, and one discovers a mission statement you would think financial institutions would want to embrace:
“Reduce risk, increase transparency, promote market integrity.”
The challenge, it seems, is implementing regulatory compliance while encouraging innovation, which at first glance represents a dichotomy. The cost of stricter regulations, however, is a tax on inefficiency. Effectively, compliance encourages firms to:
- Assess both the design and operating effectiveness of internal controls
- Understand the flow of transactions including technological aspects
- Evaluate controls designed to prevent or detect fraud and other risks
- Rely on management’s work based on competency and transparency
- Scale requirements considering the size and complexity of the company
Does Regulation Substitute or Complement Governance?
When all is said and done, regulatory requirements come down to data management. Legislation like Sarbanes-Oxley and Dodd-Frank has ushered in the necessity of adopting a data governance program to align information accountabilities amongst stakeholders, and to foster intelligent collaboration between the business and technology.
“Data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be made accountable for any adverse event that happens because of low data quality. It is about putting people in charge of fixing and preventing issues with data so that the enterprise can become more efficient. Data governance also describes an evolutionary process for a company, altering the company’s way of thinking and setting up the processes to handle information so that it may be utilized by the entire organization. It’s about using technology when necessary in many forms to help aid the process. When companies desire, or are required, to gain control of their data, they empower their people, set up processes and get help from technology to do it.”
Key is providing checks and balances between those who create/collect information, and those who consume/analyze information. In any enterprise, much less a large institution, this is not an easy task.
Some stakeholders are concerned with operational systems and data, while others care mostly about analysis, reporting, and decision-making. In fact, the needs of stakeholders who are concerned about data quality and controlling access to information may conflict with those of stakeholders who want to increase the ability to acquire and share content, records, and reports. In addition, these needs must consider risk management, data security, and legal issues. To make matters more complicated, stakeholders tend to have different vernaculars to describe their assumptions, requirements, drivers, and constraints.
The question is how best to implement data governance within an organization. It is one thing for a company to desire or be required “to gain control of their data,” but it is altogether another issue to “empower their people” and do it in practice.
The answer to the above question may exist in applying Agile/Scrum methodologies and scaling the agile mindset across the enterprise by implementing a matrix organization.
Agile/Scrum Basics and Empirical Control Processes
For those not familiar with the term, agile is a “lightweight” methodology based on iterative and incremental development, where requirements and solutions evolve through collaboration between self-organizing, cross-functional teams. The approach evolved in the 1990s as a reaction against “heavyweight” methods, which are characterized as regimented and micromanaged, i.e., the traditional waterfall model of development.
Adaptive methods (change driven approach) focus on adapting quickly to changing realities. Predictive methods (plan driven approach), in contrast, rely on analysis and planning, and often institute a change control process to try to manage the project’s outcome. The irony with the latter method is that planning for delivery of requirements often takes place when the least is known about the best possible solution and how to implement the desired outcome (ref: cone of uncertainty).
Figure 1. Iron Triangle Waterfall / Agile Paradigm Shift
That said, if both the problem and solution are known, a waterfall approach may be more suitable. However, if there are unknowns, an agile approach allows incremental maturation and implementation. In reality, methods exist on a continuum from adaptive to predictive; therefore, the best method for managing a project depends on the context of the situation.
Scrum is one form of agile that works well when the problem is known but the solution is unknown, such as in software development. Kanban, on the other hand, generally works well in an operations environment where we know what skills are involved, but not the scope of the work itself. In situations where there are unknown unknowns, such as in research and development, Lean Startup helps facilitate experimentation and validated learning.
Figure 2A. The Spectrum of Process Complexity
Figure 2B. Development Methods Based on Degrees of Uncertainty
In this article we focus on how to scale Agile/Scrum to implement a data governance program. Scrum itself is a framework within which people can address complex adaptive problems, while productively and creatively delivering products of the highest possible value. Scrum is lightweight, simple to understand, but difficult to master well.
Review of Scrum Basics 
Scrum consists of Scrum Teams and their associated roles, events, artifacts, and rules. It employs an iterative, incremental approach founded on three pillars—transparency, inspection, and adaptation—to optimize predictability and control risk.
The Scrum Team is made up of a Product Owner, the Development Team (6 ±3 members), and a Scrum Master. Teams are full-time, self-organizing, and cross-functional. Self-organizing teams choose how best to accomplish their work, rather than being directed by others. Cross-functional teams have all the competencies needed to accomplish the work without depending on others not part of the team. The team model in Scrum is designed to optimize flexibility, creativity, and productivity.
Core Scrum prescribes four events: Sprint Planning, Daily Scrum, Sprint Review, and Sprint Retrospective. The heart of Scrum is a Sprint, a time-box of one month or less during which a useable and potentially releasable product Increment is created.
Sprints have consistent durations throughout a development effort. A new Sprint starts immediately after the conclusion of the previous Sprint. Other than the Sprint itself, which is a container for all other events, each event in Scrum is a formal opportunity to inspect and adapt the product being developed or improve the work process.
Figure 3. Source: Roger W. Brown, MS, CSC, CST http://www.agilecrossing.com/ 
Scrum’s artifacts represent work that is useful in providing transparency and opportunities for inspection and adaptation. The Product Backlog is an ordered list of everything that might be needed in the product, and is the source of requirements for any changes to be made to the product. The Sprint Backlog is a subset of Product Backlog items (i.e., requirements) selected for the Sprint plus a plan for realizing the Sprint Goal.
The Sprint Backlog defines the work the Development Team will perform to turn Product Backlog items into a “Done” Increment. The Development Team tracks remaining work for every Daily Scrum and modifies the Sprint Backlog throughout the Sprint. In this way, the Product Increment emerges as the Development Team works through the plan and learns more about the work needed to achieve the Sprint Goal.
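As a rough illustration of how these artifacts relate, the following Python sketch models a Product Backlog as an ordered list and a Sprint Backlog as the subset of items selected for a Sprint. The class names, the `estimate` field, the sample items, and the capacity-based selection rule are all assumptions made for this example; Scrum itself does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class BacklogItem:
    title: str
    estimate: int          # hypothetical effort estimate (e.g., story points)
    done: bool = False


@dataclass
class SprintBacklog:
    goal: str
    items: list = field(default_factory=list)

    def remaining_work(self) -> int:
        # Sum of estimates for items not yet "Done"; reviewed at each Daily Scrum.
        return sum(item.estimate for item in self.items if not item.done)


def plan_sprint(product_backlog, goal, capacity):
    """Take items from the top of the ordered Product Backlog until capacity is used."""
    sprint = SprintBacklog(goal=goal)
    used = 0
    for item in product_backlog:          # list order is the backlog order
        if used + item.estimate > capacity:
            break
        sprint.items.append(item)
        used += item.estimate
    return sprint


# Illustrative items only; not taken from any real backlog.
product_backlog = [
    BacklogItem("Define customer data standard", 5),
    BacklogItem("Map trade data lineage", 8),
    BacklogItem("Document access-control policy", 3),
]
sprint = plan_sprint(product_backlog, goal="Baseline customer data", capacity=13)
sprint.items[0].done = True               # the team finishes the first item
print([item.title for item in sprint.items], sprint.remaining_work())
```

In practice a Development Team negotiates the selection with the Product Owner rather than applying a fixed cutoff; the cutoff here only keeps the sketch short.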
Elements of a Successful Data Governance Framework
Because data management in an organization is a non-linear problem, data governance and stewardship lend themselves to an agile method. For example, different stakeholders will have different priorities: some are mainly concerned with data quality or business intelligence, while others are focused on policy enforcement, access control, and/or setting standards.
In order to accomplish successful data governance, clarification of the “why, what, who, where, when, and how” needs to evolve through collaboration across the enterprise. Hence, a data governance program should be designed holistically, taking into consideration the interdependencies of data definitions, standards and rules, decision rights, accountabilities, and process controls. Within the context of a typical enterprise, the complexity of such a design requires a structure and methodology to guide the effort, and generate worthwhile results.
Program delivery structure
Borrowing from Project Management Institute’s (PMI) concept of a project management office (PMO), a data governance office (DGO) serves as the organizational body for data governance, with responsibility for overseeing the initiative and related communications. Because data governance may involve a collection of related projects managed in a coordinated way, the various project workstreams resemble what PMI describes as program management.
This is not to say we are advocating a traditional project management approach. As discussed in the next section, Agile/Scrum can be scaled across an enterprise to achieve data governance and stewardship objectives. In this context, a Product Owner’s role is to maximize the value of work done on behalf of the DGO. To help guide the DGO’s efforts, core Scrum can be extended to include a vision statement, product roadmap, and release planning. See The Roadmap to Value.
Just like any other meaningful effort, a business case for data governance should articulate the benefits of the project, align the project to enterprise strategy, and justify use of resources. A Vision Statement is owned by the DGO/Product Owner and provides context for how current development supports the overall goal. In effect, it aligns the objectives of data governance with the business case and sets strategic expectations.
A Product Roadmap is a statement of intent that is credible and potentially achievable. It is a planned future laid out in broad strokes and outlines proposed release themes. Typically a Product Roadmap lists high level functionality targeted within a quarterly period, and extends two to four significant feature releases into the future. Roadmap entries are not, however, commitments. They are “desirements” for the future given what is known today.
Release Planning identifies the highest priority features and provides a focal point for the Scrum Team to mobilize around, called a Release Goal. A release may consist of several Sprints, with the Release Sprint coming last. Release Planning is the process by which a Product Backlog is managed, and it sets up Stage 4 of the Roadmap to Value, which is Sprint Planning.
Program content development
While the above discussion provides a “structure” and methodology in which to deliver a data governance program, it does not provide a framework to develop and manage the “content” of such initiative.
Scrum is ideal for new product development such as software. Conversely, the “product features” that data governance is seeking to “release” primarily revolve around Business Process Management (BPM). And while technology systems may need to be developed, implemented, and/or upgraded in order to achieve the goals of a data governance and stewardship initiative, recall Steve Sarsfield’s (2009) definition:
“Data governance is a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be made accountable for any adverse event that happens because of low data quality. It is about putting people in charge of fixing and preventing issues with data so that the enterprise can become more efficient.”
With Sarsfield’s definition in mind, the development of a data governance initiative is largely based on business analysis. The International Institute of Business Analysis’ (IIBA) “Business Analysis Body of Knowledge” (BABOK) provides a holistic and flexible framework around which a data governance initiative can be mapped.
Figure 5. Adapted from graphic based on IIBA BABOK diagram 
Leveraging the BABOK framework, the DGO designs, governs, and manages the data governance program through the establishment of a governance backlog, rules of engagement, performance controls, and communications management.
A Governance Backlog consists of Epics (e.g., Use Cases) captured as modified User Stories using the “ABC” approach. That is, if we do A, then we can expect B, which should lead to C. The “content” of these Epics should cover at least the following:
- Policies and guidelines (including regulations)
- Stakeholder accountabilities and decision rights
- Requirements (business, stakeholder, functional, non-functional, transition)
- Data standards (rules and definitions including metadata)
- Data processes and controls (gap analysis, solution approach)
- Data risk management (legal, audit and compliance)
- Data repositories and technology systems (systems of record)
- Critical success factors (definition of “Done”)
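To make the “ABC” format and the content checklist above more concrete, here is a minimal Python sketch of a Governance Backlog. The `GovernanceEpic` class, the short keys in `CONTENT_AREAS`, and the sample Epics are hypothetical names invented for this illustration; only the eight content areas themselves come from the list above.

```python
from dataclasses import dataclass

# Content areas from the list above; the short keys are invented for this sketch.
CONTENT_AREAS = {
    "policies", "accountabilities", "requirements", "standards",
    "processes_controls", "risk", "repositories", "success_factors",
}


@dataclass
class GovernanceEpic:
    action: str        # A: what we do
    outcome: str       # B: what we expect as a result
    benefit: str       # C: what that should lead to
    areas: frozenset   # which content areas this Epic covers

    def as_story(self) -> str:
        return (f"If we {self.action}, then we can expect {self.outcome}, "
                f"which should lead to {self.benefit}.")


def uncovered_areas(backlog):
    """Return the content areas that no Epic in the backlog addresses yet."""
    covered = set()
    for epic in backlog:
        covered |= epic.areas
    return CONTENT_AREAS - covered


# Illustrative Epics only.
backlog = [
    GovernanceEpic("define a single counterparty identifier",
                   "consistent records across systems",
                   "fewer reconciliation breaks",
                   frozenset({"standards", "processes_controls"})),
    GovernanceEpic("assign an owner to each system of record",
                   "clear decision rights",
                   "faster resolution of data issues",
                   frozenset({"accountabilities", "risk"})),
]
print(backlog[0].as_story())
print(sorted(uncovered_areas(backlog)))
```

A coverage check of this kind is one way a DGO might notice that, for example, no Epic yet addresses policies or critical success factors.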
In the scaled Agile/Scrum structure contemplated in this article, the DGO, other than design, does not perform the work of developing and deploying the program, but participates by driving, guiding, and governing through various mechanisms.
Rules of Engagement are a key consideration in a robust data governance program. Who has the right to make what types of decisions and when, including protocols that should be followed, needs to be understood by all stakeholders. This can be formal or informal. Related to Rules of Engagement is Stakeholder Accountabilities which involves formalizing responsibilities for program participants. Issues with data can often be traced to unclear accountabilities somewhere along the data process flow. Any time new processes, practices or controls are introduced, stakeholder accountability needs to be assigned.
Performance Control is another key responsibility of the DGO, and is necessary to ensure that data governance activities are transparent and executed as effectively as possible. This task involves determining which metrics will be used to measure the work performed by program participants, keeping in mind the Agile principle of “barely sufficient” rather than “gold-plating” to stay practical and efficient.
“The more robust a process or tool, the more you spend on its care and feeding, and the more you defer to it. With people front and center, however, the result is a leap in productivity.”
Last but not least, Communications Management is critical in aligning stakeholders and evolving a common understanding. Because stakeholders represent people from different backgrounds and business domains, robust two-way communications is the most challenging aspect of a data governance program. As we shall see in the next section, a scaled Agile/Scrum approach is well designed to help facilitate two-way communications and accountabilities.
Scaling Agile/Scrum to Implement Data Governance
The perception in most organizations is that an investment in greater humanity reduces the bottom line, and an increase in business performance takes a toll on the humanity of people at work. Agile inverts this relationship by recognizing that in the knowledge economy, creating healthy workplaces in which people are inspired to deliver solutions is critical to engaging talent, delivering solutions, and gaining a competitive advantage.
The Agile Manifesto reads, in its entirety, as follows:
We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more.
With the above in mind, successful data management boils down to individuals and interactions over processes and tools. After all, data is meaningless unless transformed into information that people put into context and use in a meaningful way.
Dependency Management in a Large Agile Environment
Data stakeholders come from across the organization and include people and systems that create and use data, and those who set rules and requirements for data.
Some stakeholders may be concerned with all of the data in a particular domain, while others will only care about a limited data set over which they assume ownership. Likewise, a data architect may be concerned about metadata considerations, while audit and compliance care mostly about who has access and control. Another consideration is stakeholders being left out of the loop on data-related decisions. Other participants, such as technology, may serve as proxies for actual stakeholders.
Aside from these issues, the majority of enterprises are not disciplined in how they track information and do not assign accountabilities for managing specific data assets.
Be that as it may, organizations typically have data stewards at various points within the data process flow. Stewards are often not formally assigned accountability but rather take on the responsibility of managing data pursuant to their own needs. These are the people who, in the normal course of action, are involved in the definition, production, and usage of data. Not only are they involved, they are de facto the people making decisions about data, whether responsibility is formally assigned or not.
While data stewards play an integral role in data governance, the responsibility for conducting and managing stakeholder analysis, and for making sure all data stewards are appropriately represented and accountable, belongs to the DGO. In addition, the DGO needs to articulate and advocate the value of data governance and stewardship activities, and provide ongoing stakeholder communication, access, record-keeping, and education (CARE). Utilizing soft skills to influence alignment of knowledge and practices, the DGO/Product Owner serves as a liaison from stakeholders to data stewards and vice versa.
The remaining question, then, is how a DGO/Product Owner can practically manage such a complex process, especially in an organization that embraces an Agile/Scrum approach.
Matrix organization model
One of the more exciting challenges for Agile enthusiasts is scaling such practices across an enterprise. Having recently been involved in a project involving many of the same data management concerns described above, we find Kniberg and Ivarsson’s (2012) white paper on “Scaling Agile @ Spotify with Tribes, Squads, Chapters & Guilds” inspiring.
The model described represents a matrix organization (see Figure 6), and suggests a means by which to design a robust data governance and stewardship program. These concepts can also be applied to a traditional enterprise for data governance.
Figure 6. Based on Kniberg and Ivarsson (2012). “Scaling Agile @ Spotify”
The basic unit of development is a Scrum Team called a “Squad”. Each Squad is a self-contained, cross-functional, self-organizing team with the skills needed to execute and be responsible for a long-term mission aligned to the goals of a business case. Because each Squad focuses on one mission related to a business case, they become experts in that area.
Squads are aggregated into collections called “Tribes” which work in related areas. Each Tribe has a Tribe Lead who is responsible for providing the best possible habitat for the Squads within that Tribe. Squads in a Tribe work best if physically located in the same office so as to promote collaboration between the Squads. Tribes hold gatherings on a regular basis where they show the rest of the Tribe what they have delivered, what they are working on, and what others can learn from what they are doing. The white paper describes Tribes as being “incubators” for Squads, each of which is designed to feel like a mini-startup.
The downside to autonomy is a loss of economies of scale. For example, a tester in Squad A may be wrestling with a problem that the tester in Squad B solved last week. To mitigate this issue, and to better integrate the enterprise and foster better communications, Chapters and Guilds are formed to glue the organization together. This structure provides for some economies of scale without sacrificing too much autonomy.
Figure 7. Based on Kniberg and Ivarsson (2012). “Scaling Agile @ Spotify”
A Chapter is a family of people who have similar skills and work in the same general competency area within the same Tribe. Each Chapter meets regularly to discuss its area of expertise and its specific challenges. Leading the Chapter is a supervisor with traditional functional manager responsibilities called a Chapter Lead. As explained by the white paper, the Chapter Lead at Spotify is also part of a Squad involved in day-to-day work to help “ground” the role.
A Guild is a community of interest that cuts across Tribes and the organization, made up of people who want to share knowledge and practices. Each Guild has a “Guild Coordinator,” and may include all related Chapters working within a competency or functional area. However, anybody who is interested in the knowledge being shared can join any Guild.
In terms of a matrix organization, think of the vertical dimension as the “what” and the horizontal dimension as the “how”. The matrix structure matches the “professor and entrepreneur” model, where the Product Owner is the “entrepreneur” or “product champion” focusing on delivering a great product, while the Chapter Lead is the “professor” or “competency leader” focusing on technical excellence.
There is a healthy tension between these roles, as the entrepreneur tends to want to speed up and cut corners, while the professor tends to want to slow down and build things properly. The matrix structure ensures that each squad member can get guidance on “what to build next” as well as “how to build it well”. Both aspects are needed.
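The two dimensions of the matrix can be sketched as a small data model. In the Python example below, the `Member` class, the sample names, and the rule that a Guild is simply everyone sharing a competency are simplifying assumptions for illustration; in the white paper, Guild membership is open to anyone interested.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    squad: str        # vertical dimension: the "what"
    tribe: str
    competency: str   # horizontal dimension: the "how"


def chapters(members):
    """Group members into Chapters: same competency within the same Tribe."""
    groups = defaultdict(list)
    for m in members:
        groups[(m.tribe, m.competency)].append(m.name)
    return dict(groups)


def guild(members, competency):
    """A Guild cuts across Tribes; here, everyone sharing a competency (an assumption)."""
    return sorted(m.name for m in members if m.competency == competency)


# Illustrative people and teams only.
members = [
    Member("Ana", "Squad A", "Tribe 1", "testing"),
    Member("Ben", "Squad B", "Tribe 1", "testing"),
    Member("Cy", "Squad C", "Tribe 2", "testing"),
    Member("Dee", "Squad A", "Tribe 1", "data modeling"),
]
print(chapters(members))
print(guild(members, "testing"))
```

Ana and Ben sit in different Squads but share a Chapter, which is how the tester in Squad A would hear about the problem the tester in Squad B has already solved.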
Caveat! The risk with this model is that the architecture of a system is exposed to issues and risks if nobody focuses on the integrity of the system as a whole. This concern is akin to the problem facing data governance across an organization.
To mitigate this risk, the white paper discusses a role at Spotify called “System Owner” which is recommended to consist of a developer-operations pair in order to benefit from both perspectives. The System Owner is not a bottleneck or ivory tower architect, but a “go-to” person(s) for any technical or architectural issues. Responsibilities include subject matter expertise, coordination, documentation, stability and scalability. To coordinate work on high-level architectural issues across multiple systems, there is also a Chief Architect role. This role reviews development of new systems to make sure common mistakes are avoided, and that systems are aligned with the architectural vision.
DGO Scrum of Scrums
Figure 8 illustrates the concept of leveraging the Spotify matrix organization, and scaling Agile/Scrum to implement data governance whereby the DGO acts in the capacity of a “Scrum of Scrums”. For example, the DGO Squad may be composed of Guild Coordinators (GC) from the data steward guild (yellow) and data stakeholder guild (green), Chapter Leaders (CL) involved in data management concerns, as well as the Chief Architect (CA). Note: in this example Systems Owners (SO) are represented vis-à-vis the data steward guild and operations architecture guild (blue).
Figure 8. Adapted from Kniberg and Ivarsson (2012). “Scaling Agile @ Spotify”
One of the major concerns with large complex organizations is dependencies. In fact, a key reason for creating a data governance and stewardship program is to solve data dependencies that block or slow progress, as well as improve data quality and internal controls. A major responsibility of the DGO is to inventory and manage data dependency requirements, as well as manage traceability—that is, create and maintain a mapping of data relationships between stakeholders.
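A dependency inventory with traceability could, under some simplifying assumptions, be kept as a list of producer and consumer relationships. The Python sketch below is hypothetical throughout: the Squad names, data assets, and the `trace` and `upstream` helpers are invented for illustration and do not come from the white paper.

```python
from collections import defaultdict

# Hypothetical inventory: (data asset, producing Squad, consuming Squad)
DEPENDENCIES = [
    ("counterparty master", "Reference Data", "Risk Reporting"),
    ("counterparty master", "Reference Data", "Trade Capture"),
    ("swap positions", "Trade Capture", "Risk Reporting"),
    ("swap positions", "Trade Capture", "Regulatory Reporting"),
]


def trace(asset, inventory):
    """Return the sets of Squads that produce and consume a given data asset."""
    producers = {p for a, p, _ in inventory if a == asset}
    consumers = {c for a, _, c in inventory if a == asset}
    return producers, consumers


def upstream(squad, inventory):
    """Return every Squad the given Squad depends on, directly or indirectly."""
    feeds = defaultdict(set)
    for _, producer, consumer in inventory:
        feeds[consumer].add(producer)
    seen, stack = set(), [squad]
    while stack:
        for producer in feeds[stack.pop()]:
            if producer not in seen:
                seen.add(producer)
                stack.append(producer)
    return seen


print(trace("swap positions", DEPENDENCIES))
print(upstream("Regulatory Reporting", DEPENDENCIES))
```

The indirect lookup matters because a Squad can be blocked by a team it never talks to: in this invented inventory, Regulatory Reporting depends on Reference Data only through Trade Capture.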
The key focus of the DGO Squad is to identify and resolve dependencies throughout the organization with respect to data management issues (red lines). A common source of dependency issues at many companies is development versus operations (blue lines) where a “handoff” occurs with associated friction and delays.
The white paper describes how Spotify minimizes this friction by having the operations’ job be primarily a support function (in the form of infrastructure, scripts, and routines) whereby the Development Squads release the code themselves. The issue we have with this approach is that it may not stand up to scrutiny under SOX, especially in a financial institution. Instead, we recommend that, akin to the DGO Scrum of Scrums, a Dev-Ops Scrum of Scrums smooth the road to production releases.
Figure 9. Adapted from Kniberg and Ivarsson (2012). “Scaling Agile @ Spotify”
The key in any organization is finding the right balance. On one hand, the goal is to have teams work as autonomously as possible in self-contained, cross-functional, self-organizing Squads. On the other hand, without practices to bridge each Squad’s work, it becomes problematic to eliminate cross-team dependencies. The model described above should help facilitate robust data governance practices and even lead to innovation.
Some Concluding Thoughts on Data Governance
The Dodd-Frank Act was enacted to reduce risk, increase transparency, and promote market integrity within the financial market system. Contemplate that…
Scrum is founded on empirical process control theory and utilizes an iterative, incremental approach to optimize predictability and control risk. Empiricism asserts that knowledge comes from experience and that decisions should be made based on what is known. The three pillars of empirical process control are transparency, inspection, and adaptation.
The Dodd-Frank Act is effectively advocating continuous improvement. The key to continuous improvement is constant optimization of data into meaningful information. Quality data reduces risk, increases transparency, and promotes market integrity. How is this accomplished? By implementing a data governance and stewardship program.
Because information doesn’t exist in a vacuum, a robust data governance and stewardship program must, at its core, be about empowering people in an enterprise to create solutions. This is how raw data becomes meaningful information.
The effort touches upon a range of disciplines encompassing two facets. The first is tangible, sensible, and definable—getting stuff done. These tangibles include enterprise architecture, data architecture, data management, metadata management, information technology, corporate and project management.
The other facet is the human side—the intangibles. Understanding intangibles is about anthropology, communications, quality, risk, and governance. It’s about individuals and interactions over processes and tools.
Henrik Kniberg and Anders Ivarsson (2012). “Scaling Agile @ Spotify with Tribes, Squads, Chapters & Guilds” October 2012. Source: http://blog.crisp.se/2012/11/14/henrikkniberg/scaling-agile-at-spotify
David A. Becher and Melissa B. Frye (2010). “Does Regulation Substitute or Complement Governance?” (August 20, 2010). Journal of Banking and Finance, Forthcoming. Available at SSRN: http://ssrn.com/abstract=1108309
Eric Babinet and Rajani Ramanathan (2008). “Dependency Management in a Large Agile Environment” pp.401-406, Agile 2008.
Steve Sarsfield (2009). The Data Governance Imperative: A Business Strategy for Corporate Data. Ely: IT Governance Pub.
Gwen Thomas. “The DGI Data Governance Framework” The Data Governance Institute.
Ken Schwaber and Jeff Sutherland (2011). “The Definitive Guide to Scrum: The Rules of the Game” Scrum.org, October 2011.
International Institute of Business Analysis (2009). A Guide to the Business Analysis Body of Knowledge (BABOK guide), Version 2.0. Toronto, Ont: International Institute of Business Analysis.
Project Management Institute (2008). A Guide to the Project Management Body of Knowledge (PMBOK guide), Fourth Edition. An American National Standard, ANSI/PMI 99-001-2008.
 Federal Register / Vol. 77, No. 68 / April 9, 2012 / Rules and Regulations (77 FR 21278) I. Background. “Title VII of the Dodd-Frank Act amended the Commodity Exchange Act to establish a comprehensive new regulatory framework for swaps. The legislation was enacted to reduce risk, increase transparency, and promote market integrity within the financial system…”
 Podcast: “Perspectives on Sarbanes-Oxley Compliance – Where Companies are Saving Costs and Achieving Greater Efficiencies” Source: http://www.protiviti.com/en-US/Pages/PodcastDetail.aspx?AssetID=16
 Roger W. Brown. “Introduction to Scrum V 1.3 Revised 2012” AgileCrossing.com.
 Mark Layton (2012). Agile project management for dummies. Hoboken, N.J.: Wiley. http://platinumedge.com/
 Ibid. Layton (2012).
 “In an Agile environment, the development team is ‘the talent’.” Ibid. Layton (2012).
 Gil Broza (2012). The human side of Agile: how to help your team deliver. Toronto: 3P Vantage Media.
 Agile Manifesto Copyright 2001: Kent Beck, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler, James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick, Robert C. Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland, Dave Thomas. This declaration may be freely copied in any form, but only in its entirety through this notice.
 PMI PMBOK describes this structure as a “projectized organization”.
 Scrum recommended practices state that there should be one Product Owner and one Scrum Master per Scrum Team. Unfortunately, in practice, this is often not the case. Companies sometimes combine the role of Scrum Master and Product Owner, which is inadvisable. Other organizations spread Scrum Master and/or Product Owner amongst multiple teams, which is a slightly better practice. Figure 6 illustrates one Product Owner assigned full time to each Squad, with a Scrum Master assigned to two teams; however, we are not necessarily advocating this approach, although it may be considered better than other “real world” practices. Interestingly, as described in the white paper, Spotify assigns a Product Owner to each Squad, but does not formally assign a “squad leader” (i.e., Scrum Master). Rather, Spotify provides Squads with access to an agile coach who helps them evolve and improve their way of working, and can even help run meetings.
 Poppendieck, Mary, and Thomas David Poppendieck. (2010). Leading lean software development: results are not the point. Upper Saddle River, NJ: Addison-Wesley.