This appendix defines a selection of key terms from throughout the book.
A function that transforms the output of a neuron in an artificial neural network, generally to render it capable of handling nonlinear transformations or to ensure its output value is clamped within some range (chapter 7).
A graph with no cycles (chapter 4).
A heuristic for the A* search algorithm that never overestimates the cost to reach the goal (chapter 2).
A simulation of a biological neural network using computational tools to solve problems not easily reduced into forms amenable to traditional algorithmic approaches. Note that the operation of an artificial neural network generally strays significantly from its biological counterpart (chapter 7).
A version of memoization implemented at the language level, in which the results of side-effect-free function calls are stored for lookup on subsequent identical calls (chapter 1).
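In Python, language-level automatic memoization is available through the standard library's `functools.lru_cache`; a minimal sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every result of this side-effect-free function
def fib(n: int) -> int:
    if n < 2:  # base cases
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # runs in linear time because repeated calls hit the cache
```

Without the decorator, the same call would take exponential time, recomputing the same subproblems over and over.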
A technique used for training neural network weights according to a set of inputs with known-correct outputs. Partial derivatives are used to calculate each weight’s “responsibility” for the error between actual results and expected results. These deltas are used to update the weights for future runs (chapter 7).
Returning to an earlier decision point (to go a different direction than was last pursued) after hitting a wall in a search problem (chapter 3).
A data structure that stores a sequence of 1s and 0s represented using a single bit of memory for each. This is sometimes referred to as a bit vector or bit array (chapter 1).
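One way to sketch such a structure in Python is to pack the bits into a single arbitrary-precision int (the class and method names here are illustrative, not from any particular library):

```python
class BitString:
    """A compact sequence of 1s and 0s backed by a single Python int."""

    def __init__(self) -> None:
        self.bits: int = 0
        self.length: int = 0

    def append(self, bit: int) -> None:
        # shift existing bits left and place the new bit at the end
        self.bits = (self.bits << 1) | bit
        self.length += 1

    def get(self, index: int) -> int:
        # index 0 is the first bit appended
        return (self.bits >> (self.length - 1 - index)) & 1
```

Each stored bit costs one bit of the backing integer, rather than the full object overhead of a Python `bool` in a list.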
The center point in a cluster. Typically, each dimension of this point is the mean of the cluster's points in that dimension (chapter 6).
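Computing a centroid is a per-dimension mean; a minimal sketch in Python:

```python
from statistics import mean

def centroid(points: list[tuple[float, ...]]) -> tuple[float, ...]:
    # zip(*points) groups the points' values dimension by dimension
    return tuple(mean(dim) for dim in zip(*points))

print(centroid([(0.0, 0.0), (2.0, 4.0), (4.0, 2.0)]))  # (2.0, 2.0)
```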
In a genetic algorithm, each individual in the population is referred to as a chromosome (chapter 5).
See clustering (chapter 6).
An unsupervised learning technique that divides a data set into groups of related points, known as clusters (chapter 6).
A combination of three nucleotides that codes for an amino acid (chapter 2).
Encoding data (changing its form) to require less space (chapter 1).
A graph property that indicates there is a path from any vertex to any other vertex (chapter 4).
A requirement that must be fulfilled in order for a constraint-satisfaction problem to be solved (chapter 3).
In a genetic algorithm, combining individuals from the population to create offspring that are a mixture of the parents and that will be a part of the next generation (chapter 5).
A text interchange format in which rows of data sets have their values separated by commas, and the rows themselves are generally separated by newline characters. CSV stands for comma-separated values. CSV is a common export format from spreadsheets and databases (chapter 7).
A path in a graph that visits the same vertex twice without backtracking (chapter 4).
Reversing the process of compression, returning the data to its original form (chapter 1).
Something of a buzzword, deep learning can refer to any of several techniques that use advanced machine-learning algorithms to analyze big data. Most commonly, deep learning refers to using multilayer artificial neural networks to solve problems using large data sets (chapter 7).
A value that is representative of a gap between the expected value of a weight in a neural network and its actual value. The expected value is determined through the use of training data and backpropagation (chapter 7).
See directed graph (chapter 4).
Also known as a digraph, a directed graph is a graph in which edges may only be traversed in one direction (chapter 4).
The possible values of a variable in a constraint-satisfaction problem (chapter 3).
Instead of solving a large problem outright using a brute-force approach, in dynamic programming the problem is broken up into smaller subproblems that are each more manageable (chapter 9).
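A classic illustration is making change with the fewest coins: instead of trying every combination, a table of answers to the smaller subproblems (smaller amounts) is built up first. This sketch is illustrative, not tied to any particular chapter's code:

```python
def min_coins(amount: int, coins: list[int]) -> int:
    """Fewest coins needed to make `amount`, built up from smaller amounts."""
    table = [0] + [float("inf")] * amount  # table[a] = fewest coins for amount a
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                # using coin c reduces the problem to the smaller amount a - c
                table[a] = min(table[a], table[a - c] + 1)
    return int(table[amount])

print(min_coins(11, [1, 5, 6]))  # 2 (6 + 5)
```

Note that a greedy approach (always taking the largest coin) would answer 7 here (6 + 1 + 1 + 1 + 1 + 1), which is why the subproblem table matters.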
A connection between two vertices (nodes) in a graph (chapter 4).
See XOR (chapter 1).
A type of neural network in which signals propagate in one direction (chapter 7).
A function that evaluates the effectiveness of a potential solution to a problem (chapter 5).
One round in the evaluation of a genetic algorithm; also used to refer to the population of individuals active in a round (chapter 5).
Programs that modify themselves using the selection, crossover, and mutation operators to find solutions to programming problems that are nonobvious (chapter 5).
The method of modifying an artificial neural network’s weights using the deltas calculated during backpropagation and the learning rate (chapter 7).
An abstract mathematical construct that is used for modeling a real-world problem by dividing the problem into a set of connected nodes. The nodes are known as vertices, and the connections are known as edges (chapter 4).
An algorithm that always selects the best immediate choice at any decision point, in the hope that this will lead to the globally optimal solution (chapter 4).
An intuition about the way to solve a problem that points in the right direction (chapter 2).
Any layers between the input layer and the output layer in a feed-forward artificial neural network (chapter 7).
A loop that does not terminate (chapter 1).
A set of recursive calls that does not terminate but instead continues to make additional recursive calls. Analogous to an infinite loop. Usually caused by the lack of a base case (chapter 1).
The first layer of a feed-forward artificial neural network that receives its input from some kind of external entity (chapter 7).
A value, usually a constant, used to adjust the rate at which weights are modified in an artificial neural network, based on calculated deltas (chapter 7).
A technique in which the results of computational tasks are stored for later retrieval from memory, saving additional computation time to re-create the same results (chapter 1).
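A hand-rolled memoizer can be as simple as a dictionary keyed by the function's arguments; a minimal sketch (the helper name is illustrative):

```python
def make_memoized(fn):
    cache: dict = {}

    def wrapper(*args):
        if args not in cache:      # compute only on a cache miss
            cache[args] = fn(*args)
        return cache[args]         # otherwise retrieve the stored result

    return wrapper

square = make_memoized(lambda x: x * x)
```

After the first call with a given argument, every identical call is a dictionary lookup instead of a recomputation.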
A spanning tree that connects all vertices using the minimum total weight of edges (chapter 4).
In a genetic algorithm, randomly changing some property of an individual before it is included in the next generation (chapter 5).
The evolutionary process by which well-adapted organisms succeed and poorly adapted organisms fail. Given a limited set of resources in the environment, the organisms best suited to leverage those resources will survive and propagate. Over several generations, this leads to helpful traits being propagated amongst a population, hence being naturally selected by the constraints of the environment (chapter 5).
A network of multiple neurons that act in concert to process information. Neurons are often thought about as being organized in layers (chapter 7).
An individual nerve cell, such as those in the human brain (chapter 7).
The process of making different types of data comparable (chapter 6).
A problem that belongs to a class of problems for which there is no known polynomial-time algorithm (chapter 9).
One instance of one of the four bases of DNA: adenine (A), cytosine (C), guanine (G), and thymine (T) (chapter 2).
The last layer in a feed-forward artificial neural network that is used for determining the result of the network for a given input and problem (chapter 7).
A set of edges that connects two vertices in a graph (chapter 4).
A turn (often thought of as a move) in a two-player game (chapter 8).
In a genetic algorithm, the population is the collection of individuals (each representing a potential solution to the problem) competing to solve the problem (chapter 5).
A data structure that pops items based on a “priority” ordering. For instance, a priority queue may be used with a collection of emergency calls in order to respond to the highest-priority calls first (chapter 2).
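Python's standard library implements a priority queue as a binary heap in `heapq`; a minimal sketch of the emergency-call example, with lower numbers meaning higher priority:

```python
import heapq

calls: list[tuple[int, str]] = []  # (priority, description)
heapq.heappush(calls, (2, "cat stuck in tree"))
heapq.heappush(calls, (1, "building fire"))
heapq.heappush(calls, (3, "noise complaint"))
print(heapq.heappop(calls)[1])  # "building fire" comes out first
```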
An abstract data structure that enforces the ordering FIFO (First-In-First-Out). A queue implementation provides at least the operations push and pop for adding and removing elements, respectively (chapter 2).
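In Python, `collections.deque` gives an efficient queue; a minimal sketch:

```python
from collections import deque

queue: deque[str] = deque()
queue.append("first")    # push onto the back
queue.append("second")
print(queue.popleft())   # pop from the front -> "first" (FIFO)
```

A plain list would also work, but `list.pop(0)` is linear time, while `deque.popleft()` is constant time.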
A function that calls itself (chapter 1).
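The canonical example, with the base case that prevents infinite recursion:

```python
def factorial(n: int) -> int:
    if n <= 1:                       # base case stops the recursion
        return 1
    return n * factorial(n - 1)      # the function calls itself

print(factorial(5))  # 120
```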
The process of selecting individuals in a generation of a genetic algorithm for reproduction to create individuals for the next generation (chapter 5).
One of a set of popular activation functions used in artificial neural networks. The eponymous sigmoid function always returns a value between 0 and 1. It is also useful for ensuring that results beyond just linear transformations can be represented by the network (chapter 7).
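The sigmoid function is S(x) = 1 / (1 + e^-x); a minimal sketch in Python:

```python
from math import exp

def sigmoid(x: float) -> float:
    # S-shaped curve: output is always strictly between 0 and 1
    return 1.0 / (1.0 + exp(-x))

print(sigmoid(0.0))  # 0.5
```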
Microprocessor instructions optimized for doing calculations using vectors, also sometimes known as vector instructions. SIMD stands for single instruction, multiple data (chapter 7).
A tree that connects every vertex in a graph (chapter 4).
An abstract data structure that enforces the Last-In-First-Out (LIFO) ordering. A stack implementation provides at least the operations push and pop for adding and removing elements, respectively (chapter 2).
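In Python, a plain list already behaves as a stack; a minimal sketch:

```python
stack: list[int] = []
stack.append(1)     # push
stack.append(2)     # push
top = stack.pop()   # pop returns the most recently pushed item (LIFO)
```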
Any machine-learning technique in which the algorithm is somehow guided toward correct results using outside resources (chapter 7).
Gaps between neurons in which neurotransmitters are released to allow for the conduction of electrical current. In layman’s terms, these are the connections between neurons (chapter 7).
A phase in which an artificial neural network has its weights adjusted by using backpropagation with known-correct outputs for some given inputs (chapter 7).
A graph that has only one path between any two vertices. A tree is acyclic (chapter 4).
Any machine-learning technique that does not use foreknowledge to reach its conclusions—in other words, a technique that is not guided but instead runs on its own (chapter 6).
In the context of a constraint-satisfaction problem, a variable is some parameter that must be solved for as part of the problem’s solution. The possible values of the variable are its domain. The requirements for a solution are one or more constraints (chapter 3).
A single node in a graph (chapter 4).
A logical bitwise operation that returns true when either of its operands is true but not when both are true or neither is true. The abbreviation stands for exclusive or. In Python, the ^ operator is used for XOR (chapter 1).
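A quick illustration of `^` on two bit patterns:

```python
a, b = 0b1010, 0b0110
print(bin(a ^ b))  # 0b1100: a result bit is 1 where exactly one operand has a 1
```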
The number of standard deviations a data point is away from the mean of a data set (chapter 6).
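Z-scores can be computed with the standard library's `statistics` module; a minimal sketch using the population standard deviation:

```python
from statistics import mean, pstdev

def zscores(data: list[float]) -> list[float]:
    avg = mean(data)
    std = pstdev(data)  # population standard deviation
    # each point's distance from the mean, measured in standard deviations
    return [(x - avg) / std for x in data]
```

A point at the mean scores 0; points one standard deviation above or below score +1 or -1, which makes dimensions with different units comparable.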