Index
Note: Figures are indicated with f; footnotes are indicated with n; tables are indicated with t.
SYMBOLS
%C/A (percent complete and accurate), 11, 85f, 95
NUMBERS
2/4 outage, 318–320
18F team, 369–371
A
A/B testing
conclusions on, 280
feature planning and, 278
feature testing and, 276–277
feature toggles and, 196–197
history of, 275
need for, 273
our release and, 277–278
practical guide to, 276
Accelerate: The Science of Lean and DevOps, xln, 14, 219
acceptance stage, 152f, 153
acceptance test-driven development (ATDD), 159
acceptance tests, 36, 155–156
accountability, as cultural pillar, 76
Adams, Keith, 338
Adidas, 286–287
Agile
DevOps as continuation of, xxv
Infrastructure Movement, 5–6, 410
Manifesto, 4–5, 410
Agile Software Development with Scrum, 122n
Aisin Seiki Global, 52
Alcoa, 50, 313
alert fatigue, 247
Algra, Ingrid, 343n
Allspaw, John, xxiii, 5, 6, 197, 234, 283, 284, 298, 308, 319, 360, 385n, 410
Amazon
architecture transformation at, 210, 212–213
continuous delivery at, 200–201
DevOps myths and, xxiv–xxv
market-oriented teams at, 102
“master of disaster” at, 316, 317, 399
post-mortem, 311n
service-oriented architecture (SOA) at, 109–110
two-pizza teams at, 110, 111
Amazon Auto Scaling (AAS), 251–252
Amazon Reboot of 2014, Great, 315–316
Amazon Web Services (AWS)
CloudWatch, 391
outage, 305–306, 315
security solutions architect at, 389
ambiguous threats, 313–314
American Airlines
brownfield transformation, 68
case study, 15–18, 74–77
new vocabulary at, 74–77t
anchoring bias, 388
Anderson, David J., 22
Andon button, 38n
Andon cord
description of, 37–39
Excella, 39–41f
for low-risk releases, 181
work stoppage and, 416–417f
anomaly detection
advanced, 255–257
defined, 253
Kolmogorov-Smirnov test, 254, 255, 256f, 257
smoothing for, 253–254
Antani, Snehal, 354
antifragility, 52
APIs
cleanly defined, 214
enablement, 112–114
loosely coupled architecture and, 209
microservice-based architecture driven by, 388
self-service, 230
versioned, 214
application logging telemetry, 231–233
application performance monitors, 235n
application-based release patterns
dark launching, 190, 197–199
defined, 190
feature toggles, 190, 195–197
Arbuckle, Justin, 324, 354
Architectural Review Board (ARB), 333
architecture
Amazon, 212–213
Blackboard Learn, 215–217f
conclusions on, 218
downward spiral in, 208–209
eBay, 207–208
loosely coupled, 26–27, 209, 217
monoliths versus microservices, 210–212
overly tight, 26–27
service-oriented, 109, 210
strangler fig application pattern and, 70, 208, 213–217, 218
Architecture and Patterns for IT, 208
Art of Monitoring, The, 228, 230f
Ashman, David, 215, 216, 217f
ATM cash machines, 392
Atwood, Jeff, 170, 171, 293
Audit Defense Toolkit, DevOps, 391
auditors and compliance officers
ATM cash machines and, 392
PCI compliance at Etsy, 385–387
proving compliance, 389–391
tension between IT and, 389
Austin, Jim, 245
automated testing
Andon cord and, 163–165
categories of tests, 155–156
conclusions on, 166
constraints and, 26
deployment pipeline infrastructure, 151–154
essential components of, 165–166
fast and reproducible tests, 166
at Google, 148–151
green build and, 154, 163, 166
ideal testing pyramid, 157f–158
need for, 147–148
non-functional requirements and, 162–163, 328
observability and, 147
performance testing environment, 161–162
reducing reliance on manual tests, 160–161
research supporting, 165–166
running tests in parallel, 158f–159
test-driven development (TDD), 159, 161, 327
automation, DevOps and, xxvi
Ayers, Zack, 39, 41f
B
bad apple theory, 307
Baker, Bill, 141
banks, as IT companies, xxxv
batch sizes, reducing, 9, 22–24f, 409
Bazaarvoice, 173–175, 176
Beck, Kent, 159, 178
Beedle, Mike, 122n
Behr, Kevin, xlii, 225, 233n, 415f
Bell Labs, 54–55
Besnard, Denis, 416
Betz, Charles, 208
Beyond The Phoenix Project, 57
Big Fish Games, 115–117
big-bang approach, 73
bimodal IT, 71, 72
Blackboard Learn, 215–217f
blameless post-mortems
CSG case study, 318–320
defined, 308
inviting Ops engineers to, 117, 123–124
for organizational learning, xxxix, 48–49
publishing reports of, 311–312
sample agenda, 418–419
scheduling of, 308–310
Bland, Mike, 148, 149, 344, 345, 370
Blank, Steve, 411
Blankenship, Ed, 240
blitz, improvement, 335, 336
blue-green deployment pattern
description of, 190, 191f–192
for point-of-sale system, 193
Bohmer, Richard M. J., 313
Booch, Grady, 6n
Bosu, Biswanath, 388
bottlenecks
DevOps and, 405–406
generalists and, 106t
handoffs, queues, and, 414
Boubez, Toufic, 249, 250f, 251f, 255, 256
bounded contexts, 109
Bouwman, Jan-Joost, 343n
Brakeman, 359, 362f–363
branches, feature, 170
branching by abstraction, 214n
branching strategies, 167, 170–171
Brittain, Mike, 241f, 262f
Brooks, Frederick, xli
Brooks’s Law, 286
brownfield projects
case study, 69–71
defined, 67
greenfield projects versus, 66–69
technical debt and, 67
Building Blocks, Blackboard Learn, 215–217f
Building the Future: Big Teaming for Audacious Innovation, 312
bureaucratic organizations, 42, 47, 48t
bureaucratic processes, cutting, 42–43, 296–297, 299
Burgess, Mark, 6n
burnout, decreased, xl, 15, 175
business relationship manager, 116
Buytaert, Kris, 343n
C
Cagan, Marty, 90
Campbell-Pretty, Em, 135–136
Canahuati, Pedro, 105, 263
canary release pattern, 190, 194f–195
canary release test, 177n
Capital One
biz and tech partnership at, 387–389
case study, 342–343, 387–389
Got Goo? program, 296
internal conferences, 342–343
cardholder data breaches, 364, 371
cardholder data environment (CDE), 385–386
case studies
Adidas, 286–287
Amazon, 212–213
American Airlines, 15–18, 74–77
ATM cash machines, 392
Bazaarvoice, 173–175, 176
Bell Labs, 54–55
Blackboard Learn, 215–217f
Capital One, 342–343, 387–389
CSG International, 181–183, 201–206, 318–320
Dixons Retail, 193
Etsy, 186–188, 332, 373–375, 385–387
Excella, 39–41f
Facebook, 198–199
Fannie Mae, 376–378
Federal Government agencies, 369–371
Google, 269–271, 290–291
hospital system, 29–32
hotel company, 143–144
Kessel Run refueling system, 66–69
LinkedIn, 91–93, 237–238
Nationwide Building Society, 124–127
Nationwide Insurance, 342
Netflix, 251–253
Pivotal Labs, 293–294
Salesforce.com, 383–384
Target, 112–114, 333–334, 342, 343
tax collection agency for UK, 77–80
Twitter, 360–363
Yahoo! Answers, 278–280
Chacon, Scott, 281, 282f
Chakrabarti, Arup, 263
Change, John Shook’s Model of, 205
change advisory board (CAB), 380, 382
change approval processes
case studies, 383–384, 385–387, 387–389
dangers of, 283–284
at Etsy, 385–387
normal changes, 380, 381–382
at Salesforce.com, 383–384
security and compliance in, 379–380
standard changes, 379–381, 383–384
three categories of changes, 379–380
urgent changes, 380
change control failure, 283
change freezes, 292
Chaos Gorilla, 420
Chaos Kong, 420
Chaos Monkey, 52, 55, 306–307, 315
Chapman, Brent, 347
Chapman, Janet, 124, 125, 126f
chat rooms
announcing changes with, 288
drawbacks of, 94–95
Hubot at GitHub, 321–323
organizational knowledge and, 321–322
shared goals and, 94
as watercooler, 322
Chuvakin, Anton A., 232
Clanton, Ross, 74, 75f, 76, 77t, 297, 335, 336, 343
Claudius, Jonathan, 375
Clemm, Josh, 91–93
cloud computing, five characteristics of, 330–331
cloud native, 306
Cloud System Administration, The Practice of, 325
Cloud.gov, 370, 371
cluster immune system, 190, 195n
coaching kata, 53
Cockcroft, Adrian, xxxit, 102n, 231, 296
code
infrastructure as, 6n
libraries, 356
maintainability, 326
repositories, 355–357
signing, 359–360
Code Climate, 359
code commits
automated tests on, 160, 166
daily, 172
gated commits, 172
Google, 150, 290
Pivotal Labs, 294
security and, 357
strangler fig application pattern, 215–217f
code reviews. See also change approval processes
ATM systems and, 392
change reviews versus, 289n
defined, 288
email pass around, 290
forms of, 289–290
Google, 290–291
guidelines for, 288–289
learning-based culture and, 339
“over the shoulder,” 290
pair programming, 289, 292–294
Pivotal Labs, 293–294
requesting, 282
separation of duty and, 184, 384–385, 386
size of change and, 289
tool-assisted, 290
unauthorized access and, 375, 376
Codecov security breach, 368–369
Cohen, Joshua, 39, 41f
Collins, Justin, 360, 361
Columbia space shuttle, 313
commit stage, 152f, 153
Common Vulnerabilities and Exposures (CVE), 365
communities of practice, 343–345
compliance officers and auditors
ATM cash machines and, 392
Payment Card Industry Data Security Standards (PCI DSS), 385
PCI compliance at Etsy, 385–387
proving compliance, 389–391
separation of duty and, 379, 384–387
tension between IT and, 389
complicated-subsystem teams, 112
conferences
internal, 342–343
sharing experiences from DevOps, 341–342
Conformity Monkey, 420
Conrad, Ben, 78
constraint identification, 25–27
Constraints, Theory of, 4, 412–413
Consul, 242n
containers, 143–144, 152, 153n
contextual inquiry, 264
continual learning and experimentation, 3, 45–56
continuous delivery. See also deployment process; low-risk releases, enabling
continuous deployment versus, 199–201
defined, 133, 200
elite performance and, 201
infrastructure monitoring and, 243
low-risk releases and, 199–201
Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation, xxiv, 151, 193, 199, 219
Continuous Delivery Movement, 6, 410–411
continuous integration (CI) and testing
defined, 36, 151n
deployment pipeline infrastructure and, 151–154
continuous integration practices
Andon cord and, 163–165
at Bazaarvoice, 173–175, 176
catching errors early, 157–163
reliable automated validation test suite, 154–156
three capabilities required for, 154
trunk-based development, 167–176
convergence of DevOps, 409–412
Conway, Melvin, 97
Conway’s Law
conclusions about, 114
defined, 61, 97–98
at Etsy, 98–100, 108
organizational archetypes and, 100–101
Target case study and, 112–114
team boundaries in accordance with, 108
two-pizza team and, 110–111
Cook, Richard, 58, 319
Cook, Scott, 274
core conflict in IT operations, xxxii–xxxiii, 412f–413
Corman, Josh, 353, 363, 368, 412
Cornago, Fernando, 286
costs of IT-related deficiencies, xxxvi–xxxvii
COTS software, 417–418
counterfactual thinking, 283n, 310
COVID-19 pandemic
call centers during, 126–127
remote work and, 108n
UK’s financial support package, 77–80
Cox, Jason, 107, 119–120, 296
crowdsourcing technology governance, 333–334
CSG International
blameless post-mortem, 318–320
brownfield transformation, 68
case study, 181–183, 201–206, 318–320
daily deployments at, 181–183
generalists, 106–107
organizational transformation, 201–206
culture, organizational
high-trust, xxxix, 45, 48, 150
importance of, 46
just culture, 47, 307–308
learning-based, 48–49, 339
safety culture, 46–49
three pillars of, 75–76
three types of, 47, 48t
culture of causality, 225, 233n
Cundiff, Dan, 333
Cunningham, Ward, xxxii, 171
customer acquisition funnels, 240, 273, 275, 278
customers, two types of, 43
D
daily work, improvement of, 49–50
daily work of development
at Big Fish Games, 115–117
conclusions on, 127–128
embedded Ops engineers in, 116, 119–120
Ops liaisons in, 116, 117, 120–121
shared services in, 117–118
team rituals in, 117, 121–124
dark launch
defined, 190, 197–198
Facebook Chat, 198–199
dashboard
Adidas, 287
creating a simple, 237n
daily work and, 234
Etsy, 227
public health, 421
Data Breach Investigation Report (DBIR), 364
database changes, dealing with, 192
database syntax error, 374
Davenport, Shawn, 295n
Debois, Patrick, xxiii, 5, 6, 405–406, 410
DEBUG level, 232
dedicated release engineer, 116
defects, as waste, 28
DeGrandis, Dominica, 22, 58
Dekker, Sidney, 34, 47, 58, 307, 347, 395
(Delicate) Art of Bureaucracy, The, 299
Deming, W. Edwards, 38, 54
demonstrations, compliance by, 354
dependency scanning, 359
deploy code button, 186
Deployinator console, Etsy, 187–188f
deployment lead time, 8–11f, 409
deployment pipeline foundations
Andon cord and, 163–165
containers, 143–144, 152, 153n
defined, 6, 151, 152f
definition of “done,” 144–145
Enterprise Data Warehouse, 135–137
goal of deployment pipeline, 153
infrastructure, 151–154
on-demand creation of test environments, 137–138
rebuilding instead of repairing infrastructure, 141–143
single repository of truth, 139–141, 150, 324–327
deployment pipeline protection
ATMs and production telemetry, 392
auditors and, 389–391
Capital One, 387–389
change advisory board (CAB) and, 380, 381, 382
change approval processes and, 379–380
Etsy, 385–387
for normal changes, 380, 381–382
separation of duty, 379, 384–387
for standard changes, 379–381, 383–384
deployment process
Andon cord and, 181
automating, 179–181
continuous deployment, 199–201
CSG International, 181–183, 201–206
decouple deployments from releases, 189–199
Dixons Retail, 193
Etsy, 186–188
Facebook, 177–179
release versus, 189
self-service deployments, 184–188
smoke testing deployments, 180, 187
deploys per day per developer, xxxviii, xxxixf–xl
destructive testing, 383–384
Dev team rituals, 121–124
Dev tests, 36
developer productivity
DevOps practices and, xlif
measuring, 401–403
shared services for, 117–119
development, daily work of
at Big Fish Games, 115–117
conclusions on, 127–128
embedded Ops engineers in, 116, 119–120
Ops liaisons in, 116, 117, 120–121
shared services in, 117–118
team rituals in, 117, 121–124
development, hypothesis-driven
at Intuit, 273–275
need for, 273, 280
development, test-driven (TDD). See also automated testing
building incrementally with, 161
defined, 159
shared libraries and, 327
study on, 159n
development, trunk-based
adopting practices of, 172
at Bazaarvoice, 173–175, 176
benefits of, 175
conclusions on, 175–176
gated commits and, 172
for HP’s LaserJet Firmware division, 168–170
need for, 167, 171
DevOps
breaking downward spiral with, xxxv–xxxvii
business value of, xxxvii–xxxix
convergence of, 409–412
core conflict in IT operations and, xxxii–xxxiii, 412f–413
developer productivity and, xlif–xlii
downward spiral in IT and, xxxiii–xxxvii, 413t–414t
history of, 3–6
myths about, xxiv–xxv
outcomes created by, xxviii, xxix
principles underpinning, 12–14
DevOps Days. See DevOpsDays
DevOps Enterprise Summit, xxxix, 341–342, 343
DevOps journeys. See case studies
DevOpsDays, xxiii, xxiv, 6, 341, 343n, 353, 355, 410
Dickerson, Chad, 99
Dignan, Larry, 110, 111
direct response marketing, 275
Disaster Recovery Program (DiRT), Google’s, 317
Disney, 107, 119–120, 296
Dixons Retail, 193
Doctor Monkey, 420
Dojo, DevOps, 335–336
Dojos, Getting Started With, 58
“done,” modifying definition of, 144–145, 172
DORA, State of DevOps Reports from, 5, 14, 57, 103, 140, 165, 166, 175, 185, 201, 217, 227, 243, 285, 312, 326, 331, 343
downward spiral in IT
description of, xxxiii–xxxvii, 413t–414t
DevOps for breaking, xxxvii–xxxix
Drucker, Peter, 75, 80
Dunbar, Robin, 111n
Dunbar’s number, 111n
Dweck, Carol, 107
dynamic analysis, 354
E
early adopters
defined, 72f, 73
finding, 73–74
eBay, 90, 207–208
economic costs of IT-related deficiencies, xxxvi–xxxviii
Edmondson, Amy C., 312, 313, 347
Edwards, Damon, 10f, 28, 118
Eli Lilly, 312
Eltridge, Patrick, 124, 125, 126f, 127
email pass around code review, 290
emergency change advisory board (ECAB), 380
employee burnout, lower rates of, xl, 175
enabling teams, 112
Encasement Strategy, 70
Eno, Brian, 54, 55
Enterprise Data Warehouse, 135–137
envelope game, simulation of, 23, 24f
environment-based release patterns
blue-green deployments, 190, 191f–192, 193
canary releases, 190, 194f–195
cluster immune systems, 190, 195
defined, 190
ERROR level, 232
ethics of DevOps, xxxvii–xxxix
Etsy
blameless post-mortems, 310, 311, 419
brownfield transformation, 68–69
case studies, 186–188, 332, 373–375, 385–387
cluster immune system, 195n
continuous delivery at, 201
Conway’s Law and, 98–100, 108
designated Ops liaison at, 120–121
DevOps myths and, xxiv–xxv
DevOps transformation at, 226–227
experimentation, 277–278
functional orientation, 101n, 104
LAMP stack, 226
learning-based culture, 308
Morgue tool, 311–312
organizational learning, 49
PCI compliance at, 385–387
PHP run-time warning, 262f
retrospective meetings, 310, 311, 419
self-service deployment, 186–188
separation of duty, 385–387
standardizing technology stack at, 332
StatsD metric library, 234, 235f
transformation projects, 63
Evans, Eric J., 109
Evans, Jason, 338
event router, 229, 230f
Excella, Andon cord at, 39–41f
Expanding Pockets of Greatness: Spreading DevOps Horizontally in Your Organization, 129
experimentation, rapid
A/B testing, 273, 275–278
customer acquisition funnel and, 240, 273, 275, 278
at Etsy, 277–278
at Intuit, 273–275
need for, 273
at TurboTax, 274
at Yahoo! Answers, 278–280
exploratory testing, 36
Explore It!: Reduce Risk and Increase Confidence with Exploratory Testing, 36, 219
extra features, 28
extra processes, 28
F
Facebook
canary release pattern, 194f–195
case study, 198–199
code deployment, 177–179
continuous delivery, 201
experimentation, 279
feedback, 105
Gatekeeper, 196n, 199
hackathon, 338
Facebook Chat, dark launch, 198–199
fail fasts, 316n
failure parties, 312
failures
blameless post-mortem at CSG, 318–320
calculated risk-taking and, 314–315
game days to rehearse, 316–318
no fear of, 55
publishing post-mortem reports, 311–312
redefining, 314–315
rehearsing and injecting, 315–316
retrospective meetings after, 308–310
weak failure signals, 313–314
fallbacks, 316n
Fannie Mae, 376–378
Farley, David, 6, 151, 152f, 156, 158f, 193, 199, 214, 410
Farley, Steve, 339, 342
Farr, Will, 295n
Farrall, Paul, 115–116
fast and reproducible tests, 166
FATAL level, 232
Fearless Organization, The, 347
feature branches, 170
feature flags. See feature toggles
feature freezes, 175
feature toggles, 195–197, 277
features
extra, 28
planning, 278
testing, 276–277, 279
user excitement and, 241f
Federal Government agencies, 369–371
feedback. See also The Second Way (principles of feedback)
Andon cord and, 37–41
customer observation, 264–265
cycle times, 37f
fast and constant, 10, 13–14, 33
optimizing for downstream work centers, 43–44
quality control closer to the source, 42–43
pager rotation duties, 263–264
principles of, 3, 13–14, 33
production telemetry and, 261–262
safe deployment of code and, 259–261
safety within complex systems, 33–34
seeing problems as they occur, 35–36, 244
self-management by developers, 265–271
stakeholder, 36
swarming, 37–41
types and cycle times, 36–37f
user, 36, 37f
Fernandez, Roberto, 73, 100
Fifth Discipline, The, 35, 49
First Way, The. See The First Way (principles of flow)
Fitz, Tim, 6, 192n, 199, 411
Five Dysfunctions of a Team: A Leadership Fable, 347
fix forward, 262
fixed mindset, 107
fixits, 345
Flickr, 197–198, 278, 360
flow, principles of, 3. See also The First Way (principles of flow)
flow metrics, 11–12
focusing steps, five, 26, 32
following work downstream, 264–265
Forsgren, Nicole, xxxix–xl, 5, 14, 140, 228f, 363, 401–403, 404
Fowler, Martin, 156, 157, 213, 214, 220, 306n
fraud, defined, 373
Fryman, James, 295n
full-stack engineer, 106, 386
functional-oriented organizations
defined, 100
DevOps outcomes in, 103–104
market orientation versus, 100–101, 103f
problems of, 101–102
funding services and products, 107–108
Furtado, Adam, 69, 70
G
Gaekwad, Karthik, 339
Galbreath, Nick, 259–261, 355, 373–375
Gall’s Law, 70
Gallimore, Jeff, 41
game days, 52, 316–318
Ganglia, 226, 229
gated commits, 172
Gatekeeper, Facebook’s, 196n, 199
Gauntlt security tool, 353, 357
Gaussian distribution, 247f, 249, 253
GE Capital, 324, 354
Geer, Dan, 368
Geinert, Levi, 333
General Electric, CEO of, xxxi–xxxii
General Motors manufacturing plant, 35, 38, 45
generalists, 105–107
generative organizations, 47–48t, 57
Gertner, Jon, 54, 55
GitHub
functional orientation, 101n, 104
Hubot, 321–323
Octoverse Report, 57, 368, 401
peer review, 281–283
pull request processes, 295–296
vulnerability timeline, 368
GitHub Flow, 282
The Goal: A Process of Ongoing Improvement, xlii, xliii, 30, 406
goals
global, 19, 21
improvement, 88
Goldratt, Eliyahu M., xxxiii, xlii, 25, 26, 32, 406
Goldratt’s Theory of Constraints, 25–26, 32
Google
architecture, 209f–210
automated testing, 148–151
case study, 269–271, 290–291
code reviews, 290–291
continuous delivery, 200–201
DevOps myths and, xxiv–xxv
disaster recovery program, 317–318
grouplets, 344, 345
launch and hand-off readiness review, 269–271
retrospective documents, 311
service-oriented architectures (SOA), 109–110
shared source code repository, 300, 324–326
Testing on the Toilet newsletter, 149n, 344
Web Server team, 148–151
Google Cloud Datastore, 209f, 210
Got Goo? program, 296
Gothelf, Jeff, 411
Govindarajan, Vijay, 86, 87
Goyal, Rakesh, 387, 388
Grafana, 79, 234, 254, 255
Gramm-Leach-Bliley Act, 389
Graphite, 226, 234, 235f, 236, 254, 255, 322, 374f
Gray, Jim, 212
green build, 154, 163, 166
greenfield vs. brownfield services, 66–69
grouplets, 344, 345
growth mindset, 107
Gruver, Gary, 43, 148, 160, 168, 169, 170
guardrails, 79
Gupta, Prachi, 237, 238
H
Haber, Eben, 5
hackathon
defined, 337n
Facebook, 338
Hamilton, James, 306n
Hammant, Paul, 214n
Hammond, Paul, xxiii, 5, 6, 360, 410
hand back mechanism, 268–269
hand-off readiness review (HRR), 269, 270, 271f
handoffs, 20, 24–25, 414–415
hardships and waste, 27–29
healthcare organizations
generative cultures and, 47–48t
HIPAA requirements, 390–391
hospital case study, 29–32
helplessness, learned, xxxvi
Hendrickson, Elizabeth, 36, 160, 219, 293–294, 300
heroics, 10, 28–29, 170
High-Velocity Edge, The, xxxv
HIPAA, 390–391
HipHop virtual machine project (HHVM), 338
history
A/B testing, 275
DevOps, 3–6
software delivery, xxx–xxxit
HMRC tax collection agency, 77–80
Hodge, Victoria J., 245
holdouts, identifying, 74
Hollnagel, Erik, 416
Holmes, Dwayne, 143, 144
hospital system case study, 29–32
HP LaserJet, 69
HP’s LaserJet Firmware division, 168–170, 176
HSBC bank, xxxvn
Hubot, at GitHub, 321–323
Humble, Jez, xxii–xxiii, xl, 6, 151, 152f, 156, 158f, 191f, 199, 200, 207, 219, 273, 277, 284, 404–405, 406, 410
Hyatt, Matt, 78, 79, 80
hybrid schedules, 108n
hypothesis-driven development
at Intuit, 273–275
need for, 273, 280
I
Idea Factory: Bell Labs and the Great Age of American Innovation, The, 54
ideal testing pyramid, 157f–158
Imbriaco, Mark, 322
Immelt, Jeffrey, xxxi–xxxii
immersive learning opportunities, 16, 18
immutable infrastructure, 142
immutable services, 214
imposter syndrome, 148n, 310
improvement blitz, 335, 336
improvement goals, 88
improvement kata, 6
improvement of daily work, 49–50
INFO level, 232
information radiators, 236, 237, 241
information security. See also deployment pipeline protection
18F team, 369–371
bad paths, 358
Brakeman, 359, 362f–363
change approval processes, 379–380
code signing, 359–360
data breaches, 364, 368–369, 371
defect tracking, 355
dependency scanning, 359
deployment pipeline and, 357, 375–376
DevOps and, 353
dynamic analysis, 354
early-stage product demonstrations, 354
Etsy, 373–375, 385–387
Fannie Mae, 376–378
Gauntlt security tool, 353, 357
happy path, 358
Open Web Application Security Project (OWASP), 359n, 360
OWASP Dependency Check, 368n
OWASP ZAP, 358f, 359
Payment Card Industry Data Security Standards (PCI DSS), 385
post-mortems and, 355
preventive security controls, 355–357
production telemetry and, 371–373
Rugged DevOps, 353
sad and bad paths, 358
separation of duty, 379, 384–387
shared source code repositories and, 355–357
shifting security left, 376–378
silo, 353
software supply chain and, 363–369
source code integrity and code signing, 359–360
SQL injection attacks, 373, 374f
static analysis, 358f–359
Twitter case study, 360–363
Infosec. See information security
Infosec team, 83
infrastructure
centralized telemetry, 227–231
changes, 383–384
as code, 6n
deployment pipeline, 151–154
immutable, 142
metrics, 242–243
rebuilding instead of repairing, 141–143
ING technology organization, 74
innovators and early adopters
defined, 72f–73
finding, 73–74
integration tests, 156
Intuit, 273–275
IT operations
core conflict in, xxxii–xxxiii, 412f–413
developer productivity, xlif–xlii
DevOps and, xxvi
downward spiral in, xxxv–xxxix, 413t–414t
impact of DevOps on, xxxvii–xxxix
ITIL, xxvii, 139, 263n, 285, 286, 379, 380n
ITIL CMDB, 242n
J
Jacob, Adam, 6n
Jacobson, Daniel, 252f, 254f
Janitor Monkey, 420
Java
automation, 161
Bazaarvoice, 173
dependency scanning, 359
EAR and WAR files, 152
LinkedIn, 91
Maven Central, 364
ORM, 99n
JavaScript
application logging, 231n
client software level, 239
CSG, 318
eBay, 207n
Facebook, 199
libraries, 326
NPM, 364
open-source dependencies, 363
StatsD, 234
Jenkins, 152, 153, 180, 187, 205, 322, 358f, 377
JIRA, 355, 377, 381, 382, 386
Johnson, Kimberly, 376, 377–378
Jones, Angie, 161
Jones, Daniel T., 23
Joshi, Neeraj, 252f, 254f
just culture, 47, 307–308
Just Culture, 347
K
kaizen blitz, 49, 335, 336
Kalantzis, Christos, 315–316
Kanban: Successful Evolutionary Change for Your Technology Business, 22
kanban boards
example, 20f–21
Ops work on, 124
shared goals and, 94
Toyota Production System, 4
Kandogan, Eser, 5
Kanies, Luke, xxiii, 6n
Kastner, Erik, 187, 188f
Kelly, Mervin, 54, 55
Kersten, Mik, 12, 54
Kessel Run mid-air refueling system, 69–71
Kim, Gene, xxi–xxii, xl, xlii, 13f, 47n, 54, 57, 58, 225, 233n, 265, 284, 295n, 300, 353n, 364, 385n, 403–404, 406, 415f
Kissler, Courtney, 63, 64, 65, 66, 81–82, 83
Knight Capital failure, 283
Kohavi, Ron, 276, 277
Kolmogorov-Smirnov test, 254, 255, 256f, 257
Krishnan, Kripa, 317–318
Kumar, Ashish, 291f
L
laggards (skeptics), 72f, 73
LAMP stack
DevOps myth and, xxvi
Etsy, 226
large batch sizes
merges of, 170–171
small versus, 9, 22–24f, 409
Latency Monkey, 420
latent defects, 317
Lauderbach, John, 108n
launch guidance, 266–267
launch readiness review (LRR), 269, 270, 271f
Lead Architecture Review Board (LARB), 296, 297
lead time
defined, 409
focus on, 8, 18
Lean Movement and, 409
of minutes, 10–11f
process time versus, 9f
queue size and, 22, 415f
of three months, 10f
leaders
role of, 52–54
vocabulary for, 76–77t
Lean Enterprise: How High Performance Organizations Innovate at Scale, 278
Lean Manufacturing, xxx, 7, 8, 9
Lean Movement
description of, 3, 409
missing element in, 6
Lean Startup, The, 411
Lean UX movement, 411
LeanKit, 381
learned helplessness, xxxvi
learning-based culture
ASREDS learning loop, 340, 341f
communities of practice, 343–345
conclusions on, 346
DevOps conferences, 341–342
Etsy, 49
grouplets, 344, 345
importance of, 46
improvement blitz, 335, 336
internal conferences, 342–343
just culture, 47, 307–308
rituals to pay down technical debt, 336–339
Teaching Thursday, 339
thirty-day challenge at Target, 335–336
safety culture, 46–49
trust and, xli, 45, 48, 150
Leibman, Maya, 15, 16, 17, 75f, 77t
Lencioni, Patrick, 347
Lesiecki, Nick, 344
Letuchy, Eugene, 198, 199
Levenberg, Josh, 149, 300
liaisons, Ops
assigning, 117, 120–121
purpose of, 115–117
team rituals and, 121–124
two types of, 116
libraries, shared, 326, 327
Lightbody, Patrick, 263
limiting work in progress (WIP), 7, 21–22
Limoncelli, Tom, 248, 270, 271, 325–326
LinkedIn
case study, 91–93, 237–238
former monolithic architecture, 210
Operation Inversion, 91–93
self-service metrics, 237–238
Little, Christopher, xxvi, xxxv, 94
logging infrastructure, 231n
logging telemetry, 231–233
logs
centralized, 229
in monitoring framework, 230f–231
Lonestar Application Security Conference, 353
Long, Jeremy, 365
long-lived private branches, 170
loosely coupled architecture, 26–27, 209, 217
Love, Paul, 353n
low-risk releases, architecture for
at Amazon, 212–213
at Blackboard Learn, 215–217f
conclusion on, 218
downward spiral and, 208–209
at eBay, 207–208
loosely coupled architecture, 26–27, 209, 217
monoliths versus microservices, 210–212
service-oriented architecture, 109, 210
strangler fig application pattern and, 70, 208, 213–217, 218
low-risk releases, enabling
Andon cord and, 181
application-based release patterns, 190, 195–199
automating deployment process, 179–181
conclusion on, 206
continuous delivery and, 199–201
at CSG International, 181–183, 201–206
dark launching, 197–199
environment-based release patterns, 190, 191–195
at Etsy, 186–188, 201
at Facebook, 177–179, 201
feature toggles, 195–197
self-service deployments, 184–188
Luyten, Stefan, 24f
M
Macri, Bethany, 49, 419
Magill, Stephen, 364
Maglio, Paul, 5
Making Work Visible: Exposing Time Theft to Optimize Work & Flow, 22, 58
Malpass, Ian, 227, 235f, 310
Mangot, Dave, 383, 384
manual tests
automating, 160–161
high reliance on, 10
manufacturing value stream, 7
market-oriented organizations, 100–101
market-oriented teams, 102–103
Marsh, Dianne, 118
Martens, Ryan, 94
Martin, Karen, 7, 9n, 11
mass marketing, 275
Massie, Bill, 385–387
Matatall, Neil, 360, 361
Mathew, Reena, 383, 384
matrix-oriented organizations, 100
Maurer, Dan, 274
Maven Central repository, 364, 365
Maximilien, E. Michael, 159n
McChrystal, Stanley, 347
McDonnell, Patrick, 226
McKinley, Dan, 332
mean time to recover. See MTTR
means and standard deviations, 246–248
Measuring Software Quality whitepaper, 58
Mediratta, Bharat, 149, 344
mentors, test certified, 345
Messeri, Eran, 150, 290, 325
Metasploit, 359, 369
metrics
actionable business, 240–241f
application and business, 240–242
flow, 11–12
infrastructure, 242–243
library (StatsD), 234
MTTR and, 227
production, 234–235
self-service, 236–238
Metz, Cade, 338
Mickman, Heather, 112–114, 297, 343
microservices
monoliths versus, 210–212
pros and cons of, 211t
shifting to, 212–213
Microsoft Excel, 255
Microsoft Operations Framework (MOF) Study, 225
Milstein, Dan, 310
minimum viable product (MVP), 388
mistakes. See also failures
examining, 308–310
in learning-based culture, 307–308
redefining failure, 314–315
Model of Change, John Shook’s, 205
monitoring frameworks, 227–231
monolithic architectures
defined, 210, 211t
shifting from, 212–213
Moore, Geoffrey A., 72
Morgue tool, Etsy’s, 311–312
Morrison, Erica, 183f, 204, 206, 318
motion waste, 28
MTTR
batch size and, 292
culture of causality and, 233n
daily deployments and, 183f
defined, 16
fact-based problem-solving and, 234
high performers and, 186f, 227, 228f
low-risk changes and, 381
Morgue tool for recording, 311
optimizing for, 261n
tracking metrics and, 227
Mueller, Ernest, 117n, 122, 173–175, 237
Mulkey, Jody, 104–105
Multichannel Digital Tax Platform (MDTP), 78–80
multitasking, 21–22
multivariate testing, defined, 276
MySQL, xxvi, xxxit, 226, 296, 332
Mythical Man-Month, The, xli
myths
DevOps, xxiv–xxvi
industrial safety, 416t
N
Nagappan, Nachi, 159n
NASA, 231, 313, 314
National Institute of Standards and Technology (NIST), 330, 331, 389
National Vulnerability Database, 368
Nationwide Building Society, 124–127
Nationwide Insurance, 339, 342
Naval Reactors (NR), 51
.NET, xxviii, 363
Netflix
Archaius library, 196n
auto-scaling capacity, 251–253
case study, 251–253
Chaos Monkey, 52, 55, 306–307, 315, 420
cluster immune system, 195n
DevOps myths and, xxiv–xxv
as digital-native company, 15
market-oriented teams, 102
shared services, 118
Simian Army, 369n, 420
telemetry, 245–246
Newland, Jesse, 321, 323
Nmap, 359, 369
non-functional requirements (NFRs), 89, 90f, 162–163, 328
Nordstrom, 15, 63–66
normal changes, 380, 381–382
North, Dan, 191f, 193, 232
Nygard, Michael, 315, 347
O
object relational mapping (ORM), 99n
observability, and testing, 147
Octoverse Report, 57, 368, 401
O’Donnell, Glenn, 340
O’Neill, Paul, 50, 313
Open Web Application Security Project (OWASP), 359n, 360
Operation Desert Shield, 189n
opinion of the team, 40
opinionated platform, 79
Ops engineers on service teams, embedding, 119–120
Ops liaisons
assigning, 117, 120–121
purpose of, 115–117
team rituals and, 121–124
two types of, 116
O’Reilly, Barry, 278
organizational archetypes, 100–101
organizational culture
importance of, 46
just culture, 47, 307–308
safety culture, 46–49
three pillars of, 75–76
three types of, 47, 48t
trust and, xli, 45, 48, 150
organizational goals, and technology choices, 329–330
organizational knowledge
chat rooms for capturing, 321–323
easy-to-reuse, 323–324
Ops user stories and, 328–329
recommend_tech program at Target, 333–334
source code repository for, 139–141, 150, 324–327
organizational learning
ASREDS learning loop, 340, 341f
communities of practice, 343–345
conclusions on, 346
DevOps conferences, 341–342
Etsy, 49
grouplets, 344, 345
improvement blitz, 335, 336
internal conferences, 342–343
rituals to pay down technical debt, 336–339
Teaching Thursday, 339
thirty-day challenge at Target, 335–336
ORM (object relational mapping), 99n
Orzen, Mike, 49
Osterling, Mike, 7, 9n, 11
Otto, Andreia, 287
outages
2/4 outage, 318–320
Adidas, 286
Amazon Web Services (AWS), 305–306, 315
culture of blame around, 233
culture of causality and, 225
mastering, 347
Netflix, 314–315
outlier detection, 245–246
outsourcing contracts, IT, 102n
“over the shoulder” code review, 290
OWASP (Open Web Application Security Project), 359n, 360
OWASP Dependency Check, 359, 365, 368n
OWASP ZAP, 358f, 359
Özil, Giray, 289
P
pager rotation duties, 263–264
pair programmed rollback, 164n
pair programming, 289, 292–294
pairing hours, 293n
Pais, Manuel, 111, 117, 129
Pal, Tapabrata, 296, 342–343, 353n
Pandey, Ravi, 336
Parker, Doug, 76–77
Paroski, Drew, 338
partially done work, 28
passion, as cultural pillar, 75
passwords, OWASP guidance on, 360
pathological organizations, 47, 48t
Payment Card Industry Data Security Standards (PCI DSS), 385, 386
PayPal, 73n
PDCA (Plan, Do, Check, Act) cycle, 38, 54
peer review
of changes, 288–290
code review basics, 288–291
email pass around, 290
GitHub, 281–283
“over the shoulder” code review, 290
pair programming, 289, 292–294
pull requests, 281–283, 295–296
tool-assisted code review, 290
performance testing environment, 161–162
Perrow, Charles, 34
Phoenix Project, The, xxii, xxxix, xlii, 12, 404, 406, 413, 414
PHP
Conway’s Law, 98, 99
Etsy, 226, 262f, 332
Facebook, 177n, 178, 199, 338
LAMP stack and DevOps, xxvi
as representative technology, xxxit
Pinterest, 114
Pivotal Labs, 293–294
planning horizons, short, 88–89
platform teams, 112
Poppendieck, Mary and Tom, 27
Porter, Chris, 376, 377
post-mortems, blameless
CSG case study, 318–320
defined, 308
inviting Ops engineers to, 117, 123–124
for organizational learning, xxxix, 48–49
publishing reports of, 311–312
sample agenda, 418–419
scheduling of, 308–310
Potvin, Rachel, 149, 300, 324, 326
powerlessness, employees and, xxxvi
pretotyping, 275n
preventive security controls, 355–357
problem-solving
guided by telemetry, 225, 233–234
seeing problems as they occur, 35–36, 244
swarming, 37–41
process time, lead time versus, 9f
product owner, 83
production telemetry. See also telemetry; telemetry analysis
ATM systems and, 392
contextual inquiry, 264
in daily work, 234–235
feature toggles, 195–197, 262, 277
fix forward, 262
hand-off readiness review (HRR), 269, 270, 271f
information security and, 371–373
launch guidance, 266–267
launch readiness review (LRR), 270, 271f
LinkedIn, 237–238
pager rotation duties, 263–264
roll back, 262
service hand-back mechanism, 268f–269
site reliability engineers (SREs), 269–271
UX observation, 264–265
productivity, developer
DevOps practices for, xlif
measuring, 401–403
shared services for, 117–120
Project to Product, 12
prototypes, creating, 275n
Prugh, Scott, 72, 106–107, 181, 182, 183, 201–204, 231
psychological safety, 347
pull requests
defined, 281
evaluating, 295–296
GitHub, 281–283
Puppet Labs, xxiii, xxxix, xlif, 5, 14, 140, 185, 217
Python
Etsy, 332
ORM, 99n
PyPi for, 364
Zenoss and, 238
Q
quality controls, ineffective, 42–43
queue size, controlling, 22, 414–415f
Quicken, 274n
Quora, 278, 306
R
Rachitsky, Lenny, 420
Rajan, Karthik, 383
Rally, 94, 253, 381
Rapoport, Roy, 245, 246, 314
Rational Unified Process, 4
Raymond, Eric S., 97
rebooting servers, 225
Red Hat, xxxit, 138
Reddit, 306
Reddy, Tarun, 253
release, deployment versus, 189
Release It! Design and Deploy Production-Ready Software, 315, 347
release managers, 83
release patterns
application-based, 190, 195–199
environment-based, 190, 191–195
releases, architecture for low-risk. See architecture
releases, enabling low-risk
Andon cord and, 181
application-based release patterns, 190, 195–199
automating deployment process, 179–181
conclusion on, 206
continuous delivery and, 199–201
at CSG International, 181–183, 201–206
dark launching, 197–199
environment-based release patterns, 190, 191–195
at Etsy, 186–188, 201
at Facebook, 177–179, 201
feature toggles, 195–197
self-service deployments, 184–188
Rembetsy, Michael, 63, 226, 332
remote work, Covid-19 pandemic and, 108n
request for change (RFC) form, 380, 382
resilience engineering, defined, 316
resilience patterns, 51–52
resilient organizations
calculated risk-taking in, 314–315
CSG case study, 318–320
description of, 305
experimental model for, 313
game days in, 316–318
just, learning culture in, 307–308, 320
Netflix example, 305–307
post-mortem reports in, 311–312
retrospective meetings at, 308–310
retrospectives
CSG case study, 318–320
defined, 308
Ops engineers at, 117, 123–124
for organizational learning, xxxix, 48–49
publishing reports of, 311–312
sample agenda, 418–419
scheduling of, 308–310
Rettif, Lucas, 333
review and coordination processes
Adidas, 286–287
bureaucracy and, 42–43, 296–297
code review basics, 288–291
conclusions on, 297–298
coordination and scheduling, 288
dangers of change approval processes, 283–284
dangers of manual testing and change freezes, 292
dangers of overly controlling changes, 284–285
GitHub, 281–283
Google, 290–291
pair programming, 289, 292–294
Pivotal Labs, 293–294
pull requests, 281–283, 295–296
Rhoades, Lacy, 277–278
Rice, David, 412
Richardson, Vernon, xxxvin
Ries, Eric, 6n, 24, 195n, 411
Right Media, 259–261
risk-taking, calculated, 314–315
rituals
Dev team, 121–124
technical debt and, 336–339
Robbins, Jesse, 316, 317, 399, 410
Roberto, Michael, 313
roll back, 262
Rossi, Chuck, 177, 178, 198
Rother, Mike, 6, 49, 53, 54, 104, 411
Rouster, 383
Ruby, 152, 231n, 234, 359, 363, 364
Ruby on Rails, xxxit, 99n, 143, 191n, 362
Rugged Computing movement, 412
Rugged DevOps, 353
Rushgrove, Gareth, 358f
S
Sachs, Marcus, 371
safety
in complex systems, 33–34
myths about, 416t
safety culture, 46–49
Safety Differently: Human Factors for a New Era, 395
Salesforce.com, 383–384
Sarbanes-Oxley Act, 267
Savoia, Alberto, 275n
scenius, 54–55
Schmidt, Eric, 69
Schwaber, Ken, 122n
Schwartz, Mark, 299
Scott, Kevin, 92, 93
Scrum, 122n, 410
Scryer, 251–253
Second Way, The. See The Second Way (principles of feedback)
security. See information security
security and compliance, DevOps compatibility with, xxv
Security Chaos Engineering, 142n
Security Monkey, 420
selflessness, as cultural pillar, 75
Senge, Peter, 35, 49, 320
separation of duty, 379, 384–387
service-oriented architectures (SOAs), 109, 210
settling period, 243
Shafer, Andrew Clay, xxiv, 5, 410
Shared Operations Team (SOT), 181–182n
shared services, 117–119
Shewhart, Walter, 54
Shewhart cycle, 38, 54
Shingo, Shigeo, 27
Shinn, Bill, 389–391
Shook, John, 205
Shortridge, Kelly, 142n
Shoup, Randy, xli, 109–110, 164, 207, 208, 209f, 210, 211f, 289, 291, 311, 326
silent majority, 74
siloization, 105
silos
breaking down, 124–127
functional teams in, 125, 126f
learning, 340
Simian Army, 369n, 420
single repository of truth
as deployment pipeline foundation, 139–141, 150
for global improvement, 324–327, 334
at Google, 324–326
site reliability engineers (SREs), 269–271
Skelton, Matthew, 111, 117, 129
small batch sizes, 7, 9, 22–24f, 409
small teams, 110–111
Smart, Jonathan, 125, 341f, 403
smoke testing deployments, 180, 187
Smollen, Alex, 360, 361
smoothing, 253
Snover, Jeffrey, xxxii
Snyder, Ross, 98, 99, 100
software, COTS, 417–418
software delivery. See also deployment pipeline foundations; releases, enabling low-risk
faster, xxxiiit
history of, xxx–xxxit
software supply chain, security of, 363–369
software type, myth about DevOps and, xxvi
SolarWinds security breach, 368
Sonatype Nexus Lifecycle, 368n
Sonatype State of the Software Supply Chain Report, 364, 365, 366
Sooner Safer Happier, 125, 340, 395, 403
source code repository, shared
as deployment pipeline foundation, 139–141, 150
for global improvement, 324–327, 334
at Google, 324–326
preventive security controls and, 355–357
SPACE framework, 403
Spafford, George, xliv, 225, 233n, 353n, 415f
Spear, Steven J., xxxvii, 34, 38, 49, 50, 58, 105, 305, 313, 335, 338
specialists, 105–107
Spotify, 15
spring cleanings, 337
sprint planning boards, 20–21
Sprouter, 98–100n, 108
SQL injection attacks, 373, 374f
Stack Exchange, 278, 293
stakeholder feedback, 36
standard changes
description of, 379–380
at Salesforce.com, 383–384
standups, daily, 117, 120, 122, 410
Starbucks, 15
startups, DevOps for, xxiv
State of DevOps Reports, xln, xlii, 5, 14, 15, 57, 67, 71, 103, 140, 165, 166, 175, 185, 201, 217, 227, 243, 284, 285, 312, 326, 331, 343
State of the Octoverse report, GitHub’s, 57, 368, 401
static analysis, 358f–359
statistical analysis software, Tableau, 253
StatsD metric library, 234, 235f
Stillman, Jessica, 338
Stoneham, Jim, 278–279, 280
strangler fig application pattern
at Blackboard Learn, 215–217f
blog explaining, 220
defined, 70, 208, 213–215, 218
stream-aligned teams, 112
Strear, Chris, 29–32
Sussman, Noah, 187
Sussna, Jeff, 265n
swarming
case study, 39–41
purpose of, 37–39
systems of engagement, 71, 72
systems of record, 71, 72
T
Tableau, 253
Taleb, Nassim Nicholas, 52
Tang, Diane, 276
Target
API enablement, 112–114
approval processes, 296–297
case study, 112–114, 333–334, 342, 343
internal conferences, 342, 343
recommend_tech program, 333–334
thirty-day challenge, 335–336
as traditional enterprise, 15
task switching, 28
tax collection agency, UK’s, 77–80
Team of Teams: New Rules of Engagement for a Complex World, 347
team opinion, requesting, 40
team rituals, 121–124
Team Topologies: Organizing Business and Technology Teams for Fast Flow, 111, 117, 129
teams
18F team, 369–371
Conway’s Law and, 108
dedicated team, 86–88
four types of, 111–112
functional teams in silos, 125, 126f
generalists on, 105–107
Google Web Server (GWS) team, 148–151
long-lived, 126f
platform team, 112
Shared Operations Team (SOT), 181–182n
stream-aligned, 112
two-pizza team, 110–111
in value stream, 83
technical debt
brownfield transformations and, 67
defined, xxxiv–xxxv, 171, 205
reducing, 89–93, 165
rituals to pay down, 336–339
test automation and, 161, 165
technology choices
Etsy, 332
organizational goals and, 329–330
Target, 333–334
Technology Enterprise Adoption Process (TEAP), 296, 297
technology value stream. See also value stream; value stream selection
description of, 8–12
invisible work in, 19–20
telemetry. See also production telemetry; telemetry analysis
application and business metrics, 240–242
application logging, 231–233
centralized telemetry infrastructure, 227–231
conclusions on, 244
culture of causality, 225, 233n
daily work and, 234–235
defined, 225–226
detection of problems with, 36
Etsy, 226–227
gaps, 239–240
information radiators, 236, 237, 241
infrastructure metrics, 242–243
monitoring frameworks, 227–231
Netflix, 245–246
problem-solving guided by, 233–234
security-related, 371–375
self-service access to, 236–237
self-service metrics at LinkedIn, 237–238
settling period, 243–244
StatsD metric library, 234, 235f
telemetry analysis
alerting systems, 248–249, 250f, 255f
anomaly detection techniques, 253–257
Gaussian distribution, 247f, 249
Kolmogorov-Smirnov test, 254, 255, 256f, 257
means and standard deviations, 246–248
non-Gaussian distribution, 249–251f
outlier detection, 245–246
Scryer tool at Netflix, 251–253
smoothing, 253–254
Tableau software, 253
Terhorst-North, Dan, 191f, 193, 232
test setup and run, 26
test-driven development (TDD), 159, 161, 327
testing, A/B
conclusions on, 280
feature toggles and, 196–197
history of, 275
integrating into feature planning, 278
integrating into feature testing, 276–277
integrating into our release, 277–278
need for, 273
practical guide to, 276
testing, automated
Andon cord and, 163–165
categories of tests, 155–156
conclusions on, 166
deployment pipeline infrastructure, 151–154
essential components of, 165–166
fast and reproducible tests, 166
at Google, 148–151
green build and, 154, 163, 166
ideal testing pyramid, 157f–158
need for, 147–148
non-functional requirements and, 162–163, 328
observability and, 147
performance testing environment, 161–162
reducing reliance on manual tests, 160–161
research supporting, 165–166
running tests in parallel, 158f–159
test-driven development (TDD), 159, 161, 327
testing, operations, and security, as everyone’s job, 104–105
Testing on the Toilet newsletter, 149n, 344
The First Way (principles of flow)
case study, 29–32
constraint identification, 25–27
general description of, 13f, 19
handoff reduction, 24–25
kanban boards, 4, 20f–21, 94, 124
limiting work in progress (WIP), 7, 10, 21–22
making work visible, 19–21, 93, 94, 124
reducing batch sizes, 22–24f
small batch sizes, 7, 9, 22–24f
waste and hardship elimination, 27–29
The Second Way (principles of feedback)
case study, 39–41
description of, 13f–14
feedback types and cycle times, 36–37f
optimizing for downstream work centers, 43–44
quality control closer to the source, 42–43
safety within complex systems, 33–34
seeing problems as they occur, 35–36
swarming, 37–39, 41
The Third Way (continual learning and experimentation)
case study, 54–55
conclusions on, 55–56
description of, 13f, 14
global knowledge, 51
improvement of daily work, 49–50
just culture, 47, 307–308
leader’s role, 52–54
resilience patterns, 51–52
safety culture for, 46–49
The Three Ways
case study, 15–18
defined, 12–14
high-level principles of, xliv
research supporting, 14–15
Theory of Constraints, 4, 412–413
Third Way, The. See The Third Way (continual learning and experimentation)
Thoughtworks’ Tech Radar, 57
Thrasher, Paula, 129
Three Mile Island, 34
Three Ways, The. See The Three Ways
ticket, defined, 381n
Ticketmaster/LiveNation, 104, 243
Tischler, Tim, 184
Tomayko, Ryan, 295, 296
tool-assisted code review, 290
Toyota Kata: Managing People for Improvement, Adaptiveness and Superior Results, 6, 49, 104, 411
Toyota Kata Movement, description of, 4, 6, 49, 54, 411
Toyota Production System
Andon cord, 37–39, 163, 416–417f
core belief in, 285
improvement blitz, 335
improvement kata, 6
information radiators, 236, 237, 241
Lean Movement and, 409
techniques, 4
transparent uptime, 311n, 420–421
Treynor Sloss, Ben, 269
Trimble, Chris, 86, 87
trunk-based development
adopting practices of, 172
at Bazaarvoice, 173–175, 176
benefits of, 175
conclusions on, 175–176
gated commits and, 172
for HP’s LaserJet Firmware division, 168–170
need for, 167, 171
Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing, 276
TurboTax, 274, 275
Turnbull, James, 228, 230f
Twitter
architecture, 210
experimentation, 279
Fail Whale error page, 360
static security testing, 360–363
two-pizza team, 110–111
U
Unicorn Project, The, 219, 333n
unit tests, 155
universality of the solution, xlii–xliii
urgent changes, 380
US Air Force, 69–71
US Federal Government agencies, 369–371
US Navy, 34n, 51
user acceptance testing (UAT), 152
user feedback, 36, 37f
user research, 273, 275n, 277. See also A/B testing
user stories, 328–329
UX movement, Lean, 411
UX observation, 264–265
V
value stream. See also value stream selection
defined, 7, 18
manufacturing, 7
selection of, 61, 63
technology, 8–12
value stream manager, 83
value stream mapping
%C/A metrics in, 11, 85, 95
case studies, 15–18, 91–93
conclusions on, 85
creating a map, 84–86
defined, 81, 84–86
dedicated transformation team and, 86–88, 95
example of a value stream map, 85f
goals and, 88
Lean movement and, 4
members of value stream, 83
at Nordstrom, 81–83
technical debt and, 89–93
Value Stream Mapping: How to Visualize Work and Align Leadership for Organizational Transformation, 7
value stream selection
American Airlines’ new vocabulary, 74–77t
brownfield transformation of refueling system, 69–71
case studies, 69–71, 74–77, 77–80
expanding scope of initiative, 73–77
greenfield vs. brownfield services, 66–69
innovators and early adopters, 72–74
Nordstrom’s DevOps journey, 63–66
systems of engagement, 71, 72
systems of record, 71, 72
tax collection agency’s journey, 77–80
van Kemenade, Ron, 74
Van Leeuwen, Evelijn, 343n
Vance, Ashlee, 92
Velocity Conference, 410
Verizon data breach, 364, 371
version control system, 139–141
versioned APIs, 214
Vincent, John, 247
visibility of automated test failures, 164
visibility of work, 19–21, 93, 94
Visible Ops Handbook, The, 225
Visible Ops Security, 353n
visual work boards, 20f–21. See also kanban boards
vocabulary, using a new, 76–77t
Vogels, Werner, 110, 212
Vulnerabilities and Exposures, Common (CVEs), 364, 365
Vulnerability Database, National, 368
W
Walker, Jason, 333
Wang, Kendrick, 277
WARN level, 232
waste and hardship elimination, 27–29
water-Scrum-fall anti-pattern, 165n
weak failure signals, 313–314
Westrum, Ron, 47, 48t, 57
Westrum Organizational Typology Model, 48t
Wickett, James, 353
Williams, Branden, xxv
Williams, Jeff, 412
Williams, Laurie, 159n, 293
Willis, John, xxiii–xxiv, 3, 5, 57, 406–407, 409, 410
Wolberg, Kirsten, 73n
Womack, James P., 23, 53
Wong, Bruce, 316
Wong, Eric, 238
work, visibility of, 19–21, 93, 94, 124
work in process (WIP), 7, 10, 21–22
workplace safety, at Alcoa, 50
X
Xu, Ya, 276
Y
Yadav, Vikalp, 286
Yahoo! Answers, 278–280
Yuan, Danny, 252f, 254f
Z
Zenoss, 229, 238
Zhao, Haiping, 338
ZooKeeper, 242
Zuckerberg, Mark, 338