Index

Note: Figures are indicated with f; footnotes are indicated with n; tables are indicated with t.

SYMBOLS

%C/A (percent complete and accurate), 11, 85f, 95

NUMBERS

2/4 outage, 318–320

18F team, 369–371

A

A/B testing

conclusions on, 280

feature planning and, 278

feature testing and, 276–277

feature toggles and, 196–197

history of, 275

need for, 273

our release and, 277–278

practical guide to, 276

Accelerate: The Science of Lean and DevOps, xln, 14, 219

acceptance stage, 152f, 153

acceptance test-driven development (ATDD), 159

acceptance tests, 36, 155–156

accountability, as cultural pillar, 76

Adams, Keith, 338

Adidas, 286–287

Agile

DevOps as continuation of, xxv

Infrastructure Movement, 5–6, 410

Manifesto, 4–5, 410

Agile Software Development with Scrum, 122n

Aisin Seiki Global, 52

Alcoa, 50, 313

alert fatigue, 247

Algra, Ingrid, 343n

Allspaw, John, xxiii, 5, 6, 197, 234, 283, 284, 298, 308, 319, 360, 385n, 410

Amazon

architecture transformation at, 210, 212–213

continuous delivery at, 200–201

DevOps myths and, xxiv–xxv

market-oriented teams at, 102

“master of disaster” at, 316, 317, 399

post-mortem, 311n

service-oriented architecture (SOA) at, 109–110

two-pizza teams at, 110, 111

Amazon Auto Scaling (AAS), 251–252

Amazon Reboot of 2014, Great, 315–316

Amazon Web Services (AWS)

CloudWatch, 391

outage, 305–306, 315

security solutions architect at, 389

ambiguous threats, 313–314

American Airlines

brownfield transformation, 68

case study, 15–18, 74–77

new vocabulary at, 74–77t

anchoring bias, 388

Anderson, David J., 22

Andon button, 38n

Andon cord

description of, 37–39

Excella, 39–41f

for low-risk releases, 181

work stoppage and, 416–417f

anomaly detection

advanced, 255–257

defined, 253

Kolmogorov-Smirnov test, 254, 255, 256f, 257

smoothing for, 253–254

Antani, Snehal, 354

antifragility, 52

APIs

cleanly defined, 214

enablement, 112–114

loosely coupled architecture and, 209

microservice-based architecture driven by, 388

self-service, 230

versioned, 214

application logging telemetry, 231–233

application performance monitors, 235n

application-based release patterns

dark launching, 190, 197–199

defined, 190

feature toggles, 190, 195–197

Arbuckle, Justin, 324, 354

Architectural Review Board (ARB), 333

architecture

Amazon, 212–213

Blackboard Learn, 215–217f

conclusions on, 218

downward spiral in, 208–209

eBay, 207–208

loosely coupled, 26–27, 209, 217

monoliths versus microservices, 210–212

overly tight, 26–27

service-oriented, 109, 210

strangler fig application pattern and, 70, 208, 213–217, 218

Architecture and Patterns for IT, 208

Art of Monitoring, The, 228, 230f

Ashman, David, 215, 216, 217f

ATM cash machines, 392

Atwood, Jeff, 170, 171, 293

Audit Defense Toolkit, DevOps, 391

auditors and compliance officers

ATM cash machines and, 392

PCI compliance at Etsy, 385–387

proving compliance, 389–391

tension between IT and, 389

Austin, Jim, 245

automated testing

Andon cord and, 163–165

categories of tests, 155–156

conclusions on, 166

constraints and, 26

deployment pipeline infrastructure, 151–154

essential components of, 165–166

fast and reproducible tests, 166

at Google, 148–151

green build and, 154, 163, 166

ideal testing pyramid, 157f–158

need for, 147–148

non-functional requirements and, 162–163, 328

observability and, 147

performance testing environment, 161–162

reducing reliance on manual tests, 160–161

research supporting, 165–166

running tests in parallel, 158f–159

test-driven development (TDD), 159, 161, 327

automation, DevOps and, xxvi

Ayers, Zack, 39, 41f

B

bad apple theory, 307

Baker, Bill, 141

banks, as IT companies, xxxv

batch sizes, reducing, 9, 22–24f, 409

Bazaarvoice, 173–175, 176

Beck, Kent, 159, 178

Beedle, Mike, 122n

Behr, Kevin, xlii, 225, 233n, 415f

Bell Labs, 54–55

Besnard, Denis, 416

Betz, Charles, 208

Beyond The Phoenix Project, 57

Big Fish Games, 115–117

big-bang approach, 73

bimodal IT, 71, 72

Blackboard Learn, 215–217f

blameless post-mortems

CSG case study, 318–320

defined, 308

inviting Ops engineers to, 117, 123–124

for organizational learning, xxxix, 48–49

publishing reports of, 311–312

sample agenda, 418–419

scheduling of, 308–310

Bland, Mike, 148, 149, 344, 345, 370

Blank, Steve, 411

Blankenship, Ed, 240

blitz, improvement, 335, 336

blue-green deployment pattern

description of, 190, 191f–192

for point-of-sale system, 193

Bohmer, Richard M. J., 313

Booch, Grady, 6n

Bosu, Biswanath, 388

bottlenecks

DevOps and, 405–406

generalists and, 106t

handoffs, queues, and, 414

Boubez, Toufic, 249, 250f, 251f, 255, 256

bounded contexts, 109

Bouwman, Jan-Joost, 343n

Brakeman, 359, 362f–363

branches, feature, 170

branching by abstraction, 214n

branching strategies, 167, 170–171

Brittain, Mike, 241f, 262f

Brooks, Frederick, xli

Brooks’s Law, 286

brownfield projects

case study, 69–71

defined, 67

greenfield projects versus, 66–69

technical debt and, 67

Building Blocks, Blackboard Learn, 215–217f

Building the Future: Big Teaming for Audacious Innovation, 312

bureaucratic organizations, 42, 47, 48t

bureaucratic processes, cutting, 42–43, 296–297, 299

Burgess, Mark, 6n

burnout, decreased, xl, 15, 175

business relationship manager, 116

Buytaert, Kris, 343n

C

Cagan, Marty, 90

Campbell-Pretty, Em, 135–136

Canahuati, Pedro, 105, 263

canary release pattern, 190, 194f–195

canary release test, 177n

Capital One

biz and tech partnership at, 387–389

case study, 342–343, 387–389

Got Goo? program, 296

internal conferences, 342–343

cardholder data breaches, 364, 371

cardholder data environment (CDE), 385–386

case studies

Adidas, 286–287

Amazon, 212–213

American Airlines, 15–18, 74–77

ATM cash machines, 392

Bazaarvoice, 173–175, 176

Bell Labs, 54–55

Blackboard Learn, 215–217f

Capital One, 342–343, 387–389

CSG International, 181–183, 201–206, 318–320

Dixons Retail, 193

Etsy, 186–188, 332, 373–375, 385–387

Excella, 39–41f

Facebook, 198–199

Fannie Mae, 376–378

Federal Government agencies, 369–371

Google, 269–271, 290–291

hospital system, 29–32

hotel company, 143–144

Kessel Run refueling system, 69–71

LinkedIn, 91–93, 237–238

Nationwide Building Society, 124–127

Nationwide Insurance, 342

Netflix, 251–253

Pivotal Labs, 293–294

Salesforce.com, 383–384

Target, 112–114, 333–334, 342, 343

tax collection agency for UK, 77–80

Twitter, 360–363

Yahoo! Answers, 278–280

Chacon, Scott, 281, 282f

Chakrabarti, Arup, 263

Change, John Shook’s Model of, 205

change advisory board (CAB), 380, 382

change approval processes

case studies, 383–384, 385–387, 387–389

dangers of, 283–284

at Etsy, 385–387

normal changes, 380, 381–382

at Salesforce.com, 383–384

security and compliance in, 379–380

standard changes, 379–381, 383–384

three categories of changes, 379–380

urgent changes, 380

change control failure, 283

change freezes, 292

Chaos Gorilla, 420

Chaos Kong, 420

Chaos Monkey, 52, 55, 306–307, 315

Chapman, Brent, 347

Chapman, Janet, 124, 125, 126f

chat rooms

announcing changes with, 288

drawbacks of, 94–95

Hubot at GitHub, 321–323

organizational knowledge and, 321–322

shared goals and, 94

as watercooler, 322

Chuvakin, Anton A., 232

Clanton, Ross, 74, 75f, 76, 77t, 297, 335, 336, 343

Claudius, Jonathan, 375

Clemm, Josh, 91–93

cloud computing, five characteristics of, 330–331

cloud native, 306

Cloud System Administration, The Practice of, 325

Cloud.gov, 370, 371

cluster immune system, 190, 195n

coaching kata, 53

Cockcroft, Adrian, xxxit, 102n, 231, 296

code

infrastructure as, 6n

libraries, 356

maintainability, 326

repositories, 355–357

signing, 359–360

Code Climate, 359

code commits

automated tests on, 160, 166

daily, 172

gated commits, 172

Google, 150, 290

Pivotal Labs, 294

security and, 357

strangler fig application pattern, 215–217f

code reviews. See also change approval processes

ATM systems and, 392

change reviews versus, 289n

defined, 288

email pass around, 290

forms of, 289–290

Google, 290–291

guidelines for, 288–289

learning-based culture and, 339

“over the shoulder,” 290

pair programming, 289, 292–294

Pivotal Labs, 293–294

requesting, 282

separation of duty and, 184, 384–385, 386

size of change and, 289

tool-assisted, 290

unauthorized access and, 375, 376

Codecov security breach, 368–369

Cohen, Joshua, 39, 41f

Collins, Justin, 360, 361

Columbia space shuttle, 313

commit stage, 152f, 153

Common Vulnerabilities and Exposures (CVE), 365

communities of practice, 343–345

compliance officers and auditors

ATM cash machines and, 392

Payment Card Industry Data Security Standards (PCI DSS), 385

PCI compliance at Etsy, 385–387

proving compliance, 389–391

separation of duty and, 379, 384–387

tension between IT and, 389

complicated-subsystem teams, 112

conferences

internal, 342–343

sharing experiences from DevOps, 341–342

Conformity Monkey, 420

Conrad, Ben, 78

constraint identification, 25–27

Constraints, Theory of, 4, 412–413

Consul, 242n

containers, 143–144, 152, 153n

contextual inquiry, 264

continual learning and experimentation, 3, 45–56

continuous delivery. See also deployment process; low-risk releases, enabling

continuous deployment versus, 199–201

defined, 133, 200

elite performance and, 201

infrastructure monitoring and, 243

low-risk releases and, 199–201

Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation, xxiv, 151, 193, 199, 219

Continuous Delivery Movement, 6, 410–411

continuous integration (CI) and testing

defined, 36, 151n

deployment pipeline infrastructure and, 151–154

continuous integration practices

Andon cord and, 163–165

at Bazaarvoice, 173–175, 176

catching errors early, 157–163

reliable automated validation test suite, 154–156

three capabilities required for, 154

trunk-based development, 167–176

convergence of DevOps, 409–412

Conway, Melvin, 97

Conway’s Law

conclusions about, 114

defined, 61, 97–98

at Etsy, 98–100, 108

organizational archetypes and, 100–101

Target case study and, 112–114

team boundaries in accordance with, 108

two-pizza team and, 110–111

Cook, Richard, 58, 319

Cook, Scott, 274

core conflict in IT operations, xxxii–xxxiii, 412f–413

Corman, Josh, 353, 363, 368, 412

Cornago, Fernando, 286

costs of IT-related deficiencies, xxxvi–xxxvii

COTS software, 417–418

counterfactual thinking, 283n, 310

COVID-19 pandemic

call centers during, 126–127

remote work and, 108n

UK’s financial support package, 77–80

Cox, Jason, 107, 119–120, 296

crowdsourcing technology governance, 333–334

CSG International

blameless post-mortem, 318–320

brownfield transformation, 68

case study, 181–183, 201–206, 318–320

daily deployments at, 181–183

generalists, 106–107

organizational transformation, 201–206

culture, organizational

high-trust, xxxix, 45, 48, 150

importance of, 46

just culture, 47, 307–308

learning-based, 48–49, 339

safety culture, 46–49

three pillars of, 75–76

three types of, 47, 48t

culture of causality, 225, 233n

Cundiff, Dan, 333

Cunningham, Ward, xxxii, 171

customer acquisition funnels, 240, 273, 275, 278

customers, two types of, 43

D

daily work, improvement of, 49–50

daily work of development

at Big Fish Games, 115–117

conclusions on, 127–128

embedded Ops engineers in, 116, 119–120

Ops liaisons in, 116, 117, 120–121

shared services in, 117–118

team rituals in, 117, 121–124

dark launch

defined, 190, 197–198

Facebook Chat, 198–199

dashboard

Adidas, 287

creating a simple, 237n

daily work and, 234

Etsy, 227

public health, 421

Data Breach Investigation Report (DBIR), 364

database changes, dealing with, 192

database syntax error, 374

Davenport, Shawn, 295n

Debois, Patrick, xxiii, 5, 6, 405–406, 410

DEBUG level, 232

dedicated release engineer, 116

defects, as waste, 28

DeGrandis, Dominica, 22, 58

Dekker, Sidney, 34, 47, 58, 307, 347, 395

(Delicate) Art of Bureaucracy, The, 299

Deming, W. Edwards, 38, 54

demonstrations, compliance by, 354

dependency scanning, 359

deploy code button, 186

Deployinator console, Etsy, 187–188f

deployment lead time, 8–11f, 409

deployment pipeline foundations

Andon cord and, 163–165

containers, 143–144, 152, 153n

defined, 6, 151, 152f

definition of “done,” 144–145

Enterprise Data Warehouse, 135–137

goal of deployment pipeline, 153

infrastructure, 151–154

on-demand creation of test environments, 137–138

rebuilding instead of repairing infrastructure, 141–143

single repository of truth, 139–141, 150, 324–327

deployment pipeline protection

ATMs and production telemetry, 392

auditors and, 389–391

Capital One, 387–389

change advisory board (CAB) and, 380, 381, 382

change approval processes and, 379–380

Etsy, 385–387

for normal changes, 380, 381–382

separation of duty, 379, 384–387

for standard changes, 379–381, 383–384

deployment process

Andon cord and, 181

automating, 179–181

continuous deployment, 199–201

CSG International, 181–183, 201–206

decouple deployments from releases, 189–199

Dixons Retail, 193

Etsy, 186–188

Facebook, 177–179

release versus, 189

self-service deployments, 184–188

smoke testing deployments, 180, 187

deploys per day per developer, xxxviii, xxxixf–xl

destructive testing, 383–384

Dev team rituals, 121–124

Dev tests, 36

developer productivity

DevOps practices and, xlif

measuring, 401–403

shared services for, 117–119

development, daily work of

at Big Fish Games, 115–117

conclusions on, 127–128

embedded Ops engineers in, 116, 119–120

Ops liaisons in, 116, 117, 120–121

shared services in, 117–118

team rituals in, 117, 121–124

development, hypothesis-driven

at Intuit, 273–275

need for, 273, 280

development, test-driven (TDD). See also automated testing

building incrementally with, 161

defined, 159

shared libraries and, 327

study on, 159n

development, trunk-based

adopting practices of, 172

at Bazaarvoice, 173–175, 176

benefits of, 175

conclusions on, 175–176

gated commits and, 172

for HP’s LaserJet Firmware division, 168–170

need for, 167, 171

DevOps

breaking downward spiral with, xxxv–xxxvii

business value of, xxxvii–xxxix

convergence of, 409–412

core conflict in IT operations and, xxxii–xxxiii, 412f–413

developer productivity and, xlif–xlii

downward spiral in IT and, xxxiii–xxxvii, 413t–414t

history of, 3–6

myths about, xxiv–xxv

outcomes created by, xxviii, xxix

principles underpinning, 12–14

DevOps Days. See DevOpsDays

DevOps Enterprise Summit, xxxix, 341–342, 343

DevOps journeys. See case studies

DevOpsDays, xxiii, xxiv, 6, 341, 343n, 353, 355, 410

Dickerson, Chad, 99

Dignan, Larry, 110, 111

direct response marketing, 275

Disaster Recovery Program (DiRT), Google’s, 317

Disney, 107, 119–120, 296

Dixons Retail, 193

Doctor Monkey, 420

Dojo, DevOps, 335–336

Dojos, Getting Started With, 58

“done,” modifying definition of, 144–145, 172

DORA, State of DevOps Reports from, 5, 14, 57, 103, 140, 165, 166, 175, 185, 201, 217, 227, 243, 285, 312, 326, 331, 343

downward spiral in IT

description of, xxxiii–xxxvii, 413t–414t

DevOps for breaking, xxxvii–xxxix

Drucker, Peter, 75, 80

Dunbar, Robin, 111n

Dunbar’s number, 111n

Dweck, Carol, 107

dynamic analysis, 354

E

early adopters

defined, 72f, 73

finding, 73–74

eBay, 90, 207–208

economic costs of IT-related deficiencies, xxxvi–xxxviii

Edmondson, Amy C., 312, 313, 347

Edwards, Damon, 10f, 28, 118

Eli Lilly, 312

Eltridge, Patrick, 124, 125, 126f, 127

email pass around code review, 290

emergency change advisory board (ECAB), 380

employee burnout, lower rates of, xl, 175

enabling teams, 112

Encasement Strategy, 70

Eno, Brian, 54, 55

Enterprise Data Warehouse, 135–137

envelope game, simulation of, 23, 24f

environment-based release patterns

blue-green deployments, 190, 191f–192, 193

canary releases, 190, 194f–195

cluster immune systems, 190, 195

defined, 190

ERROR level, 232

ethics of DevOps, xxxvii–xxxix

Etsy

blameless post-mortems, 310, 311, 419

brownfield transformation, 68–69

case studies, 186–188, 332, 373–375, 385–387

cluster immune system, 195n

continuous delivery at, 201

Conway’s Law and, 98–100, 108

designated Ops liaison at, 120–121

DevOps myths and, xxiv–xxv

DevOps transformation at, 226–227

experimentation, 277–278

functional orientation, 101n, 104

LAMP stack, 226

learning-based culture, 308

Morgue tool, 311–312

organizational learning, 49

PCI compliance at, 385–387

PHP run-time warning, 262f

retrospective meetings, 310, 311, 419

self-service deployment, 186–188

separation of duty, 385–387

standardizing technology stack at, 332

StatsD metric library, 234, 235f

transformation projects, 63

Evans, Eric J., 109

Evans, Jason, 338

event router, 229, 230f

Excella, Andon cord at, 39–41f

Expanding Pockets of Greatness: Spreading DevOps Horizontally in Your Organization, 129

experimentation, rapid

A/B testing, 273, 275–278

customer acquisition funnel and, 240, 273, 275, 278

at Etsy, 277–278

at Intuit, 273–275

need for, 273

at TurboTax, 274

at Yahoo! Answers, 278–280

exploratory testing, 36

Explore It!: Reduce Risk and Increase Confidence with Exploratory Testing, 36, 219

extra features, 28

extra processes, 28

F

Facebook

canary release pattern, 194f–195

case study, 198–199

code deployment, 177–179

continuous delivery, 201

experimentation, 279

feedback, 105

Gatekeeper, 196n, 199

hackathon, 338

Facebook Chat, dark launch, 198–199

fail fasts, 316n

failure parties, 312

failures

blameless post-mortem at CSG, 318–320

calculated risk-taking and, 314–315

game days to rehearse, 316–318

no fear of, 55

publishing post-mortem reports, 311–312

redefining, 314–315

rehearsing and injecting, 315–316

retrospective meetings after, 308–310

weak failure signals, 313–314

fallbacks, 316n

Fannie Mae, 376–378

Farley, David, 6, 151, 152f, 156, 158f, 193, 199, 214, 410

Farley, Steve, 339, 342

Farr, Will, 295n

Farrall, Paul, 115–116

fast and reproducible tests, 166

FATAL level, 232

Fearless Organization, The, 347

feature branches, 170

feature flags. See feature toggles

feature freezes, 175

feature toggles, 195–197, 277

features

extra, 28

planning, 278

testing, 276–277, 279

user excitement and, 241f

Federal Government agencies, 369–371

feedback. See also The Second Way (principles of feedback)

Andon cord and, 37–41

customer observation, 264–265

cycle times, 37f

fast and constant, 10, 13–14, 33

optimizing for downstream work centers, 43–44

quality control closer to the source, 42–43

pager rotation duties, 263–264

principles of, 3, 13–14, 33

production telemetry and, 261–262

safe deployment of code and, 259–261

safety within complex systems, 33–34

seeing problems as they occur, 35–36, 244

self-management by developers, 265–271

stakeholder, 36

swarming, 37–41

types and cycle times, 36–37f

user, 36, 37f

Fernandez, Roberto, 73, 100

Fifth Discipline, The, 35, 49

First Way, The. See The First Way (principles of flow)

Fitz, Tim, 6, 192n, 199, 411

Five Dysfunctions of a Team: A Leadership Fable, 347

fix forward, 262

fixed mindset, 107

fixits, 345

Flickr, 197–198, 278, 360

flow, principles of, 3. See also The First Way (principles of flow)

flow metrics, 11–12

focusing steps, five, 26, 32

following work downstream, 264–265

Forsgren, Nicole, xxxix–xl, 5, 14, 140, 228f, 363, 401–403, 404

Fowler, Martin, 156, 157, 213, 214, 220, 306n

fraud, defined, 373

Fryman, James, 295n

full-stack engineer, 106, 386

functional-oriented organizations

defined, 100

DevOps outcomes in, 103–104

market orientation versus, 100–101, 103f

problems of, 101–102

funding services and products, 107–108

Furtado, Adam, 69, 70

G

Gaekwad, Karthik, 339

Galbreath, Nick, 259–261, 355, 373–375

Gall’s Law, 70

Gallimore, Jeff, 41

game days, 52, 316–318

Ganglia, 226, 229

gated commits, 172

Gatekeeper, Facebook’s, 196n, 199

Gauntlt security tool, 353, 357

Gaussian distribution, 247f, 249, 253

GE Capital, 324, 354

Geer, Dan, 368

Geinert, Levi, 333

General Electric, CEO of, xxxi–xxxii

General Motors manufacturing plant, 35, 38, 45

generalists, 105–107

generative organizations, 47–48t, 57

Gertner, Jon, 54, 55

GitHub

functional orientation, 101n, 104

Hubot, 321–323

Octoverse Report, 57, 368, 401

peer review, 281–283

pull request processes, 295–296

vulnerability timeline, 368

GitHub Flow, 282

The Goal: A Process of Ongoing Improvement, xlii, xliii, 30, 406

goals

global, 19, 21

improvement, 88

Goldratt, Eliyahu M., xxxiii, xlii, 25, 26, 32, 406

Goldratt’s Theory of Constraints, 25–26, 32

Google

architecture, 209f–210

automated testing, 148–151

case study, 269–271, 290–291

code reviews, 290–291

continuous delivery, 200–201

DevOps myths and, xxiv–xxv

disaster recovery program, 317–318

grouplets, 344, 345

launch and hand-off readiness review, 269–271

retrospective documents, 311

service-oriented architectures (SOA), 109–110

shared source code repository, 300, 324–326

Testing on the Toilet newsletter, 149n, 344

Web Server team, 148–151

Google Cloud Datastore, 209f, 210

Got Goo? program, 296

Gothelf, Jeff, 411

Govindarajan, Vijay, 86, 87

Goyal, Rakesh, 387, 388

Grafana, 79, 234, 254, 255

Gramm-Leach-Bliley Act, 389

Graphite, 226, 234, 235f, 236, 254, 255, 322, 374f

Gray, Jim, 212

green build, 154, 163, 166

greenfield vs. brownfield services, 66–69

grouplets, 344, 345

growth mindset, 107

Gruver, Gary, 43, 148, 160, 168, 169, 170

guardrails, 79

Gupta, Prachi, 237, 238

H

Haber, Eben, 5

hackathon

defined, 337n

Facebook, 338

Hamilton, James, 306n

Hammant, Paul, 214n

Hammond, Paul, xxiii, 5, 6, 360, 410

hand back mechanism, 268–269

hand-off readiness review (HRR), 269, 270, 271f

handoffs, 20, 24–25, 414–415

hardships and waste, 27–29

healthcare organizations

generative cultures and, 47–48t

HIPAA requirements, 390–391

hospital case study, 29–32

helplessness, learned, xxxvi

Hendrickson, Elizabeth, 36, 160, 219, 293–294, 300

heroics, 10, 28–29, 170

High-Velocity Edge, The, xxxv

HIPAA, 390–391

HipHop virtual machine project (HHVM), 338

history

A/B testing, 275

DevOps, 3–6

software delivery, xxx–xxxit

HMRC tax collection agency, 77–80

Hodge, Victoria J., 245

holdouts, identifying, 74

Hollnagel, Erik, 416

Holmes, Dwayne, 143, 144

hospital system case study, 29–32

HP LaserJet, 69

HP’s LaserJet Firmware division, 168–170, 176

HSBC bank, xxxvn

Hubot, at GitHub, 321–323

Humble, Jez, xxii–xxiii, xl, 6, 151, 152f, 156, 158f, 191f, 199, 200, 207, 219, 273, 277, 284, 404–405, 406, 410

Hyatt, Matt, 78, 79, 80

hybrid schedules, 108n

hypothesis-driven development

at Intuit, 273–275

need for, 273, 280

I

Idea Factory: Bell Labs and the Great Age of American Innovation, The, 54

ideal testing pyramid, 157f–158

Imbriaco, Mark, 322

Immelt, Jeffrey, xxxi–xxxii

immersive learning opportunities, 16, 18

immutable infrastructure, 142

immutable services, 214

imposter syndrome, 148n, 310

improvement blitz, 335, 336

improvement goals, 88

improvement kata, 6

improvement of daily work, 49–50

INFO level, 232

information radiators, 236, 237, 241

information security. See also deployment pipeline protection

18F team, 369–371

bad paths, 358

Brakeman, 359, 362f–363

change approval processes, 379–380

code signing, 359–360

data breaches, 364, 368–369, 371

defect tracking, 355

dependency scanning, 359

deployment pipeline and, 357, 375–376

DevOps and, 353

dynamic analysis, 354

early-stage product demonstrations, 354

Etsy, 373–375, 385–387

Fannie Mae, 376–378

Gauntlt security tool, 353, 357

happy path, 358

Open Web Application Security Project (OWASP), 359n, 360

OWASP Dependency Check, 368n

OWASP ZAP, 358f, 359

Payment Card Industry Data Security Standards (PCI DSS), 385

post-mortems and, 355

preventive security controls, 355–357

production telemetry and, 371–373

Rugged DevOps, 353

sad and bad paths, 358

separation of duty, 379, 384–387

shared source code repositories and, 355–357

shifting security left, 376–378

silo, 353

software supply chain and, 363–369

source code integrity and code signing, 359–360

SQL injection attacks, 373, 374f

static analysis, 358f–359

Twitter case study, 360–363

Infosec. See information security

Infosec team, 83

infrastructure

centralized telemetry, 227–231

changes, 383–384

as code, 6n

deployment pipeline, 151–154

immutable, 142

metrics, 242–243

rebuilding instead of repairing, 141–143

ING technology organization, 74

innovators and early adopters

defined, 72f–73

finding, 73–74

integration tests, 156

Intuit, 273–275

IT operations

core conflict in, xxxii–xxxiii, 412f–413

developer productivity, xlif–xlii

DevOps and, xxvi

downward spiral in, xxxv–xxxix, 413t–414t

impact of DevOps on, xxxvii–xxxix

ITIL, xxvii, 139, 263n, 285, 286, 379, 380n

ITIL CMDB, 242n

J

Jacob, Adam, 6n

Jacobson, Daniel, 252f, 254f

Janitor Monkey, 420

Java

automation, 161

Bazaarvoice, 173

dependency scanning, 359

EAR and WAR files, 152

LinkedIn, 91

Maven Central, 364

ORM, 99n

JavaScript

application logging, 231n

client software level, 239

CSG, 318

eBay, 207n

Facebook, 199

libraries, 326

NPM, 364

open-source dependencies, 363

StatsD, 234

Jenkins, 152, 153, 180, 187, 205, 322, 358f, 377

JIRA, 355, 377, 381, 382, 386

Johnson, Kimberly, 376, 377–378

Jones, Angie, 161

Jones, Daniel T., 23

Joshi, Neeraj, 252f, 254f

just culture, 47, 307–308

Just Culture, 347

K

kaizen blitz, 49, 335, 336

Kalantzis, Christos, 315–316

Kanban: Successful Evolutionary Change for Your Technology Business, 22

kanban boards

example, 20f–21

Ops work on, 124

shared goals and, 94

Toyota Production System, 4

Kandogan, Eser, 5

Kanies, Luke, xxiii, 6n

Kastner, Erik, 187, 188f

Kelly, Mervin, 54, 55

Kersten, Mik, 12, 54

Kessel Run mid-air refueling system, 69–71

Kim, Gene, xxi–xxii, xl, xlii, 13f, 47n, 54, 57, 58, 225, 233n, 265, 284, 295n, 300, 353n, 364, 385n, 403–404, 406, 415f

Kissler, Courtney, 63, 64, 65, 66, 81–82, 83

Knight Capital failure, 283

Kohavi, Ron, 276, 277

Kolmogorov-Smirnov test, 254, 255, 256f, 257

Krishnan, Kripa, 317–318

Kumar, Ashish, 291f

L

laggards (skeptics), 72f, 73

LAMP stack

DevOps myth and, xxvi

Etsy, 226

large batch sizes

merges of, 170–171

small versus, 9, 22–24f, 409

Latency Monkey, 420

latent defects, 317

Lauderbach, John, 108n

launch guidance, 266–267

launch readiness review (LRR), 269, 270, 271f

Lead Architecture Review Board (LARB), 296, 297

lead time

defined, 409

focus on, 8, 18

Lean Movement and, 409

of minutes, 10–11f

process time versus, 9f

queue size and, 22, 415f

of three months, 10f

leaders

role of, 52–54

vocabulary for, 76–77t

Lean Enterprise: How High Performance Organizations Innovate at Scale, 278

Lean Manufacturing, xxx, 7, 8, 9

Lean Movement

description of, 3, 409

missing element in, 6

Lean Startup, The, 411

Lean UX movement, 411

LeanKit, 381

learned helplessness, xxxvi

learning-based culture

ASREDS learning loop, 340, 341f

communities of practice, 343–345

conclusions on, 346

DevOps conferences, 341–342

Etsy, 49

grouplets, 344, 345

importance of, 46

improvement blitz, 335, 336

internal conferences, 342–343

just culture, 47, 307–308

rituals to pay down technical debt, 336–339

Teaching Thursday, 339

thirty-day challenge at Target, 335–336

safety culture, 46–49

trust and, xli, 45, 48, 150

Leibman, Maya, 15, 16, 17, 75f, 77t

Lencioni, Patrick, 347

Lesiecki, Nick, 344

Letuchy, Eugene, 198, 199

Levenberg, Josh, 149, 300

liaisons, Ops

assigning, 117, 120–121

purpose of, 115–117

team rituals and, 121–124

two types of, 116

libraries, shared, 326, 327

Lightbody, Patrick, 263

limiting work in progress (WIP), 7, 21–22

Limoncelli, Tom, 248, 270, 271, 325–326

LinkedIn

case study, 91–93, 237–238

former monolithic architecture, 210

Operation Inversion, 91–93

self-service metrics, 237–238

Little, Christopher, xxxv, xxvi, 94

logging infrastructure, 231n

logging telemetry, 231–233

logs

centralized, 229

in monitoring framework, 230f–231

Lonestar Application Security Conference, 353

Long, Jeremy, 365

long-lived private branches, 170

loosely coupled architecture, 26–27, 209, 217

Love, Paul, 353n

low-risk releases, architecture for

at Amazon, 212–213

at Blackboard Learn, 215–217f

conclusion on, 218

downward spiral and, 208–209

at eBay, 207–208

loosely coupled architecture, 26–27, 209, 217

monoliths versus microservices, 210–212

service-oriented architecture, 109, 210

strangler fig application pattern and, 70, 208, 213–217, 218

low-risk releases, enabling

Andon cord and, 181

application-based release patterns, 190, 195–199

automating deployment process, 179–181

conclusion on, 206

continuous delivery and, 199–201

at CSG International, 181–183, 201–206

dark launching, 197–199

environment-based release patterns, 190, 191–195

at Etsy, 186–188, 201

at Facebook, 177–179, 201

feature toggles, 195–197

self-service deployments, 184–188

Luyten, Stefan, 24f

M

Macri, Bethany, 49, 419

Magill, Stephen, 364

Maglio, Paul, 5

Making Work Visible: Exposing Time Theft to Optimize Work & Flow, 22, 58

Malpass, Ian, 227, 235f, 310

Mangot, Dave, 383, 384

manual tests

automating, 160–161

high reliance on, 10

manufacturing value stream, 7

market-oriented organizations, 100–101

market-oriented teams, 102–103

Marsh, Dianne, 118

Martens, Ryan, 94

Martin, Karen, 7, 9n, 11

mass marketing, 275

Massie, Bill, 385–387

Matatall, Neil, 360, 361

Mathew, Reena, 383, 384

matrix-oriented organizations, 100

Maurer, Dan, 274

Maven Central repository, 364, 365

Maximilien, E. Michael, 159n

McChrystal, Stanley, 347

McDonnell, Patrick, 226

McKinley, Dan, 332

mean time to recover. See MTTR

means and standard deviations, 246–248

Measuring Software Quality whitepaper, 58

Mediratta, Bharat, 149, 344

mentors, test certified, 345

Messeri, Eran, 150, 290, 325

Metasploit, 359, 369

metrics

actionable business, 240–241f

application and business, 240–242

flow, 11–12

infrastructure, 242–243

library (StatsD), 234

MTTR and, 227

production, 234–235

self-service, 236–238

Metz, Cade, 338

Mickman, Heather, 112–114, 297, 343

microservices

monoliths versus, 210–212

pros and cons of, 211t

shifting to, 212–213

Microsoft Excel, 255

Microsoft Operations Framework (MOF) Study, 225

Milstein, Dan, 310

minimum viable product (MVP), 388

mistakes. See also failures

examining, 308–310

in learning-based culture, 307–308

redefining failure, 314–315

Model of Change, John Shook’s, 205

monitoring frameworks, 227–231

monolithic architectures

defined, 210, 211t

shifting from, 212–213

Moore, Geoffrey A., 72

Morgue tool, Etsy’s, 311–312

Morrison, Erica, 183f, 204, 206, 318

motion waste, 28

MTTR

batch size and, 292

culture of causality and, 233n

daily deployments and, 183f

defined, 16

fact-based problem-solving and, 234

high performers and, 186f, 227, 228f

low-risk changes and, 381

Morgue tool for recording, 311

optimizing for, 261n

tracking metrics and, 227

Mueller, Ernest, 117n, 122, 173–175, 237

Mulkey, Jody, 104–105

Multichannel Digital Tax Platform (MDTP), 78–80

multitasking, 21–22

multivariate testing, defined, 276

MySQL, xxvi, xxxit, 226, 296, 332

Mythical Man-Month, The, xli

myths

DevOps, xxiv–xxvi

industrial safety, 416t

N

Nagappan, Nachi, 159n

NASA, 231, 313, 314

National Institute of Standards and Technology (NIST), 330, 331, 389

National Vulnerability Database, 368

Nationwide Building Society, 124–127

Nationwide Insurance, 339, 342

Naval Reactors (NR), 51

.NET, xxviii, 363

Netflix

Archaius library, 196n

auto-scaling capacity, 251–253

case study, 251–253

Chaos Monkey, 52, 55, 306–307, 315, 420

cluster immune system, 195n

DevOps myths and, xxiv–xxv

as digital-native company, 15

market-oriented teams, 102

shared services, 118

Simian Army, 369n, 420

telemetry, 245–246

Newland, Jesse, 321, 323

Nmap, 359, 369

non-functional requirements (NFRs), 89, 90f, 162–163, 328

Nordstrom, 15, 63–66

normal changes, 380, 381–382

North, Dan, 191f, 193, 232

Nygard, Michael, 315, 347

O

object relational mapping (ORM), 99n

observability, and testing, 147

Octoverse Report, 57, 368, 401

O’Donnell, Glenn, 340

O’Neill, Paul, 50, 313

Open Web Application Security Project (OWASP), 359n, 360

Operation Desert Shield, 189n

opinion of the team, 40

opinionated platform, 79

Ops engineers on service teams, embedding, 119–120

Ops liaisons

assigning, 117, 120–121

purpose of, 115–117

team rituals and, 121–124

two types of, 116

O’Reilly, Barry, 278

organizational archetypes, 100–101

organizational culture

importance of, 46

just culture, 47, 307–308

safety culture, 46–49

three pillars of, 75–76

three types of, 47, 48t

trust and, xli, 45, 48, 150

organizational goals, and technology choices, 329–330

organizational knowledge

chat rooms for capturing, 321–323

easy-to-reuse, 323–324

Ops user stories and, 328–329

recommend_tech program at Target, 333–334

source code repository for, 139–141, 150, 324–327

organizational learning

ASREDS learning loop, 340, 341f

communities of practice, 343–345

conclusions on, 346

DevOps conferences, 341–342

Etsy, 49

grouplets, 344, 345

improvement blitz, 335, 336

internal conferences, 342–343

rituals to pay down technical debt, 336–339

Teaching Thursday, 339

thirty-day challenge at Target, 335–336

ORM (object relational mapping), 99n

Orzen, Mike, 49

Osterling, Mike, 7, 9n, 11

Otto, Andreia, 287

outages

2/4 outage, 318–320

Adidas, 286

Amazon Web Services (AWS), 305–306, 315

culture of blame around, 233

culture of causality and, 225

mastering, 347

Netflix, 314–315

outlier detection, 245–246

outsourcing contracts, IT, 102n

“over the shoulder” code review, 290

OWASP (Open Web Application Security Project), 359n, 360

OWASP Dependency Check, 359, 365, 368n

OWASP ZAP, 358f, 359

Özil, Giray, 289

P

pager rotation duties, 263–264

pair programmed rollback, 164n

pair programming, 289, 292–294

pairing hours, 293n

Pais, Manuel, 111, 117, 129

Pal, Tapabrata, 296, 342–343, 353n

Pandey, Ravi, 336

Parker, Doug, 76–77

Paroski, Drew, 338

partially done work, 28

passion, as cultural pillar, 75

passwords, OWASP guidance on, 360

pathological organizations, 47, 48t

Payment Card Industry Data Security Standards (PCI DSS), 385, 386

PayPal, 73n

PDCA (Plan, Do, Check, Act) cycle, 38, 54

peer review

of changes, 288–290

code review basics, 288–291

email pass around, 290

GitHub, 281–283

“over the shoulder” code review, 290

pair programming, 289, 292–294

pull requests, 281–283, 295–296

tool-assisted code review, 290

performance testing environment, 161–162

Perrow, Charles, 34

Phoenix Project, The, xxii, xxxix, xlii, 12, 404, 406, 413, 414

PHP

Conway’s Law, 98, 99

Etsy, 226, 262f, 332

Facebook, 177n, 178, 199, 338

LAMP stack and DevOps, xxvi

as representative technology, xxxit

Pinterest, 114

Pivotal Labs, 293–294

planning horizons, short, 88–89

platform teams, 112

Poppendieck, Mary and Tom, 27

Porter, Chris, 376, 377

post-mortems, blameless

CSG case study, 318–320

defined, 308

inviting Ops engineers to, 117, 123–124

for organizational learning, xxxix, 48–49

publishing reports of, 311–312

sample agenda, 418–419

scheduling of, 308–310

Potvin, Rachel, 149, 300, 324, 326

powerlessness, employees and, xxxvi

pretotyping, 275n

preventive security controls, 355–357

problem-solving

guided by telemetry, 225, 233–234

seeing problems as they occur, 35–36, 244

swarming, 37–41

process time, lead time versus, 9f

product owner, 83

production telemetry. See also telemetry; telemetry analysis

ATM systems and, 392

contextual inquiry, 264

in daily work, 234–235

feature toggles, 195–197, 262, 277

fix forward, 262

hand-off readiness review (HRR), 269, 270, 271f

information security and, 371–373

launch guidance, 266–267

launch readiness review (LRR), 270, 271f

LinkedIn, 237–238

pager rotation duties, 263–264

roll back, 262

service hand-back mechanism, 268f–269

site reliability engineers (SREs), 269–271

UX observation, 264–265

productivity, developer

DevOps practices for, xlif

measuring, 401–403

shared services for, 117–120

Project to Product, 12

prototypes, creating, 275n

Prugh, Scott, 72, 106–107, 181, 182, 183, 201–204, 231

psychological safety, 347

pull requests

defined, 281

evaluating, 295–296

GitHub, 281–283

Puppet Labs, xxiii, xxxix, xlif, 5, 14, 140, 185, 217

Python

Etsy, 332

ORM, 99n

PyPI for, 364

Zenoss and, 238

Q

quality controls, ineffective, 42–43

queue size, controlling, 22, 414–415f

Quicken, 274n

Quora, 278, 306

R

Rachitsky, Lenny, 420

Rajan, Karthik, 383

Rally, 94, 253, 381

Rapoport, Roy, 245, 246, 314

Rational Unified Process, 4

Raymond, Eric S., 97

rebooting servers, 225

Red Hat, xxxit, 138

Reddit, 306

Reddy, Tarun, 253

release, deployment versus, 189

Release It! Design and Deploy Production-Ready Software, 315, 347

release managers, 83

release patterns

application-based, 190, 195–199

environment-based, 190, 191–195

releases, architecture for low-risk. See architecture

releases, enabling low-risk

Andon cord and, 181

application-based release patterns, 190, 195–199

automating deployment process, 179–181

conclusion on, 206

continuous delivery and, 199–201

at CSG International, 181–183, 201–206

dark launching, 197–199

environment-based release patterns, 190, 191–195

at Etsy, 186–188, 201

at Facebook, 177–179, 201

feature toggles, 195–197

self-service deployments, 184–188

Rembetsy, Michael, 63, 226, 332

remote work, Covid-19 pandemic and, 108n

request for change (RFC) form, 380, 382

resilience engineering, defined, 316

resilience patterns, 51–52

resilient organizations

calculated risk-taking in, 314–315

CSG case study, 318–320

description of, 305

experimental model for, 313

game days in, 316–318

just, learning culture in, 307–308, 320

Netflix example, 305–307

post-mortem reports in, 311–312

retrospective meetings at, 308–310

retrospectives

CSG case study, 318–320

defined, 308

Ops engineers at, 117, 123–124

for organizational learning, xxxix, 48–49

publishing reports of, 311–312

sample agenda, 418–419

scheduling of, 308–310

Rettig, Lucas, 333

review and coordination processes

Adidas, 286–287

bureaucracy and, 42–43, 296–297

code review basics, 288–291

conclusions on, 297–298

coordination and scheduling, 288

dangers of change approval processes, 283–284

dangers of manual testing and change freezes, 292

dangers of overly controlling changes, 284–285

GitHub, 281–283

Google, 290–291

pair programming, 289, 292–294

Pivotal Labs, 293–294

pull requests, 281–283, 295–296

Rhoades, Lacy, 277–278

Rice, David, 412

Richardson, Vernon, xxxvin

Ries, Eric, 6n, 24, 195n, 411

Right Media, 259–261

risk-taking, calculated, 314–315

rituals

Dev team, 121–124

technical debt and, 336–339

Robbins, Jesse, 316, 317, 399, 410

Roberto, Michael, 313

roll back, 262

Rossi, Chuck, 177, 178, 198

Rother, Mike, 6, 49, 53, 54, 104, 411

Rouster, 383

Ruby, 152, 231n, 234, 359, 363, 364

Ruby on Rails, xxxit, 99n, 143, 191n, 362

Rugged Computing movement, 412

Rugged DevOps, 353

Rushgrove, Gareth, 358f

S

Sachs, Marcus, 371

safety

in complex systems, 33–34

myths about, 416t

safety culture, 46–49

Safety Differently: Human Factors for a New Era, 395

Salesforce.com, 383–384

Sarbanes-Oxley Act, 267

Savoia, Alberto, 275n

scenius, 54–55

Schmidt, Eric, 69

Schwaber, Ken, 122n

Schwartz, Mark, 299

Scott, Kevin, 92, 93

Scrum, 122n, 410

Scryer, 251–253

Second Way, The. See The Second Way (principles of feedback)

security. See information security

security and compliance, DevOps compatibility with, xxv

Security Chaos Engineering, 142n

Security Monkey, 420

selflessness, as cultural pillar, 75

Senge, Peter, 35, 49, 320

separation of duty, 379, 384–387

service-oriented architectures (SOAs), 109, 210

settling period, 243

Shafer, Andrew Clay, xxiv, 5, 410

Shared Operations Team (SOT), 181–182n

shared services, 117–119

Shewhart, Walter, 54

Shewhart cycle, 38, 54

Shingo, Shigeo, 27

Shinn, Bill, 389–391

Shook, John, 205

Shortridge, Kelly, 142n

Shoup, Randy, xli, 109–110, 164, 207, 208, 209f, 210, 211f, 289, 291, 311, 326

silent majority, 74

siloization, 105

silos

breaking down, 124–127

functional teams in, 125, 126f

learning, 340

Simian Army, 369n, 420

single repository of truth

as deployment pipeline foundation, 139–141, 150

for global improvement, 324–327, 334

at Google, 324–326

site reliability engineers (SREs), 269–271

Skelton, Matthew, 111, 117, 129

small batch sizes, 7, 9, 22–24f, 409

small teams, 110–111

Smart, Jonathan, 125, 341f, 403

smoke testing deployments, 180, 187

Smollen, Alex, 360, 361

smoothing, 253

Snover, Jeffrey, xxxii

Snyder, Ross, 98, 99, 100

software, COTS, 417–418

software delivery. See also deployment pipeline foundations; low-risk release, enabling

faster, xxxiiit

history of, xxx–xxxit

software supply chain, security of, 363–369

software type, myth about DevOps and, xxvi

SolarWinds security breach, 368

Sonatype Nexus Lifecycle, 368n

Sonatype State of the Software Supply Chain Report, 364, 365, 366

Sooner Safer Happier, 125, 340, 395, 403

source code repository, shared

as deployment pipeline foundation, 139–141, 150

for global improvement, 324–327, 334

at Google, 324–326

preventive security controls and, 355–357

SPACE framework, 403

Spafford, George, xliv, 225, 233n, 353n, 415f

Spear, Steven J., xxxvii, 34, 38, 49, 50, 58, 105, 305, 313, 335, 338

specialists, 105–107

Spotify, 15

spring cleanings, 337

sprint planning boards, 20–21

Sprouter, 98–100n, 108

SQL injection attacks, 373, 374f

Stack Exchange, 278, 293

stakeholder feedback, 36

standard changes

description of, 379–380

at Salesforce.com, 383–384

standups, daily, 117, 120, 122, 410

Starbucks, 15

startups, DevOps for, xxiv

State of DevOps Reports, xln, xlii, 5, 14, 15, 57, 67, 71, 103, 140, 165, 166, 175, 185, 201, 217, 227, 243, 284, 285, 312, 326, 331, 343

State of the Octoverse report, GitHub’s, 57, 368, 401

static analysis, 358f–359

statistical analysis software, Tableau, 253

StatsD metric library, 234, 235f

Stillman, Jessica, 338

Stoneham, Jim, 278–279, 280

strangler fig application pattern

at Blackboard Learn, 215–217f

blog explaining, 220

defined, 70, 208, 213–215, 218

stream-aligned teams, 112

Strear, Chris, 29–32

Sussman, Noah, 187

Sussna, Jeff, 265n

swarming

case study, 39–41

purpose of, 37–39

systems of engagement, 71, 72

systems of record, 71, 72

T

Tableau, 253

Taleb, Nassim Nicholas, 52

Tang, Diane, 276

Target

API enablement, 112–114

approval processes, 296–297

case study, 112–114, 333–334, 342, 343

internal conferences, 342, 343

recommend_tech program, 333–334

thirty-day challenge, 335–336

as traditional enterprise, 15

task switching, 28

tax collection agency, UK’s, 77–80

Team of Teams: New Rules of Engagement for a Complex World, 347

team opinion, requesting, 40

team rituals, 121–124

Team Topologies: Organizing Business and Technology Teams for Fast Flow, 111, 117, 129

teams

18F team, 369–371

Conway’s Law and, 108

dedicated team, 86–88

four types of, 111–112

functional teams in silos, 125, 126f

generalists on, 105–107

Google Web Server (GWS) team, 148–151

long-lived, 126f

platform team, 112

Shared Operations Team (SOT), 181–182n

stream-aligned, 112

two-pizza team, 110–111

in value stream, 83

technical debt

brownfield transformations and, 67

defined, xxxiv–xxxv, 171, 205

reducing, 89–93, 165

rituals to pay down, 336–339

test automation and, 161, 165

technology choices

Etsy, 332

organizational goals and, 329–330

Target, 333–334

Technology Enterprise Adoption Process (TEAP), 296, 297

technology value stream. See also value stream; value stream selection

description of, 8–12

invisible work in, 19–20

telemetry. See also production telemetry; telemetry analysis

application and business metrics, 240–242

application logging, 231–233

centralized telemetry infrastructure, 227–231

conclusions on, 244

culture of causality, 225, 233n

daily work and, 234–235

defined, 225–226

detection of problems with, 36

Etsy, 226–227

gaps, 239–240

information radiators, 236, 237, 241

infrastructure metrics, 242–243

monitoring frameworks, 227–231

Netflix, 245–246

problem-solving guided by, 233–234

security-related, 371–375

self-service access to, 236–237

self-service metrics at LinkedIn, 237–238

settling period, 243–244

StatsD metric library, 234, 235f

telemetry analysis

alerting systems, 248–249, 250f, 255f

anomaly detection techniques, 253–257

Gaussian distribution, 247f, 249

Kolmogorov-Smirnov test, 254, 255, 256f, 257

means and standard deviations, 246–248

non-Gaussian distribution, 249–251f

outlier detection, 245–246

Scryer tool at Netflix, 251–253

smoothing, 253–254

Tableau software, 253

Terhorst-North, Dan, 191f, 193, 232

test setup and run, 26

test-driven development (TDD), 159, 161, 327

testing, A/B

conclusions on, 280

feature toggles and, 196–197

history of, 275

integrating into feature planning, 278

integrating into feature testing, 276–277

integrating into our release, 277–278

need for, 273

practical guide to, 276

testing, automated

Andon cord and, 163–165

categories of tests, 155–156

conclusions on, 166

deployment pipeline infrastructure, 151–154

essential components of, 165–166

fast and reproducible tests, 166

at Google, 148–151

green build and, 154, 163, 166

ideal testing pyramid, 157f–158

need for, 147–148

non-functional requirements and, 162–163, 328

observability and, 147

performance testing environment, 161–162

reducing reliance on manual tests, 160–161

research supporting, 165–166

running tests in parallel, 158f–159

test-driven development (TDD), 159, 161, 327

testing, operations, and security, as everyone’s job, 104–105

Testing on the Toilet newsletter, 149n, 344

The First Way (principles of flow)

case study, 29–32

constraint identification, 25–27

general description of, 13f, 19

handoff reduction, 24–25

kanban boards, 4, 20f–21, 94, 124

limiting work in process (WIP), 7, 10, 21–22

making work visible, 19–21, 93, 94, 124

reducing batch sizes, 22–24f

small batch sizes, 7, 9, 22–24f

waste and hardship elimination, 27–29

The Second Way (principles of feedback)

case study, 39–41

description of, 13f–14

feedback types and cycle times, 36–37f

optimizing for downstream work centers, 43–44

quality control closer to the source, 42–43

safety within complex systems, 33–34

seeing problems as they occur, 35–36

swarming, 37–39, 41

The Third Way (continual learning and experimentation)

case study, 54–55

conclusions on, 55–56

description of, 13f, 14

global knowledge, 51

improvement of daily work, 49–50

just culture, 47, 307–308

leader’s role, 52–54

resilience patterns, 51–52

safety culture for, 46–49

The Three Ways

case study, 15–18

defined, 12–14

high-level principles of, xliv

research supporting, 14–15

Theory of Constraints, 4, 412–413

Third Way, The. See The Third Way (continual learning and experimentation)

Thoughtworks’ Tech Radar, 57

Thrasher, Paula, 129

Three Mile Island, 34

Three Ways, The. See The Three Ways

ticket, defined, 381n

Ticketmaster/LiveNation, 104, 243

Tischler, Tim, 184

Tomayko, Ryan, 295, 296

tool-assisted code review, 290

Toyota Kata: Managing People for Improvement, Adaptiveness and Superior Results, 6, 49, 104, 411

Toyota Kata Movement, description of, 4, 6, 49, 54, 411

Toyota Production System

Andon cord, 37–39, 163, 416–417f

core belief in, 285

improvement blitz, 335

improvement kata, 6

information radiators, 236, 237, 241

Lean Movement and, 409

techniques, 4

transparent uptime, 311n, 420–421

Treynor Sloss, Ben, 269

Trimble, Chris, 86, 87

trunk-based development

adopting practices of, 172

at Bazaarvoice, 173–175, 176

benefits of, 175

conclusions on, 175–176

gated commits and, 172

for HP’s LaserJet Firmware division, 168–170

need for, 167, 171

Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing, 276

TurboTax, 274, 275

Turnbull, James, 228, 230f

Twitter

architecture, 210

experimentation, 279

Fail Whale error page, 360

static security testing, 360–363

two-pizza team, 110–111

U

Unicorn Project, The, 219, 333n

unit tests, 155

universality of the solution, xlii–xliii

urgent changes, 380

US Air Force, 69–71

US Federal Government agencies, 369–371

US Navy, 34n, 51

user acceptance testing (UAT), 152

user feedback, 36, 37f

user research, 273, 275n, 277. See also A/B testing

user stories, 328–329

UX movement, Lean, 411

UX observation, 264–265

V

value stream. See also value stream selection

defined, 7, 18

manufacturing, 7

selection of, 61, 63

technology, 8–12

value stream manager, 83

value stream mapping

%C/A metrics in, 11, 85, 95

case studies, 15–18, 91–93

conclusions on, 85

creating a map, 84–86

defined, 81, 84–86

dedicated transformation team and, 86–88, 95

example of a value stream map, 85f

goals and, 88

Lean movement and, 4

members of value stream, 83

at Nordstrom, 81–83

technical debt and, 89–93

Value Stream Mapping: How to Visualize Work and Align Leadership for Organizational Transformation, 7

value stream selection

American Airlines’ new vocabulary, 74–77t

brownfield transformation of refueling system, 69–71

case studies, 69–71, 74–77, 77–80

expanding scope of initiative, 73–77

greenfield vs. brownfield services, 66–69

innovators and early adopters, 72–74

Nordstrom’s DevOps journey, 63–66

systems of engagement, 71, 72

systems of record, 71, 72

tax collection agency’s journey, 77–80

van Kemenade, Ron, 74

Van Leeuwen, Evelijn, 343n

Vance, Ashlee, 92

Velocity Conference, 410

Verizon data breach, 364, 371

version control system, 139–141

versioned APIs, 214

Vincent, John, 247

visibility of automated test failures, 164

visibility of work, 19–21, 93, 94

Visible Ops Handbook, The, 225

Visible Ops Security, 353n

visual work boards, 20f–21. See also kanban boards

vocabulary, using a new, 76–77t

Vogels, Werner, 110, 212

Vulnerabilities and Exposures, Common (CVEs), 364, 365

Vulnerability Database, National, 368

W

Walker, Jason, 333

Wang, Kendrick, 277

WARN level, 232

waste and hardship elimination, 27–29

water-Scrum-fall anti-pattern, 165n

weak failure signals, 313–314

Westrum, Ron, 47, 48t, 57

Westrum Organizational Typology Model, 48t

Wickett, James, 353

Williams, Branden, xxv

Williams, Jeff, 412

Williams, Laurie, 159n, 293

Willis, John, xxiii–xxiv, 3, 5, 57, 406–407, 409, 410

Wolberg, Kirsten, 73n

Womack, James P., 23, 53

Wong, Bruce, 316

Wong, Eric, 238

work, visibility of, 19–21, 93, 94, 124

work in process (WIP), 7, 10, 21–22

workplace safety, at Alcoa, 50

X

Xu, Ya, 276

Y

Yadav, Vikalp, 286

Yahoo! Answers, 278–280

Yuan, Danny, 252f, 254f

Z

Zenoss, 229, 238

Zhao, Haiping, 338

ZooKeeper, 242

Zuckerberg, Mark, 338