A lot of people use Python as a replacement for shell scripts, using it to automate common system tasks, such as manipulating files, configuring systems, and so forth. The main goal of this chapter is to describe features related to common tasks encountered when writing scripts. For example, parsing command-line options, manipulating files on the filesystem, getting useful system configuration data, and so forth. Chapter 5 also contains general information related to files and directories.
You want a script you’ve written to be able to accept input using whatever mechanism is easiest for the user. This should include piping output from a command to the script, redirecting a file into the script, or just passing a filename, or list of filenames, to the script on the command line.
Python’s built-in fileinput module makes this very simple and concise. If you
have a script that looks like this:
#!/usr/bin/env python3importfileinputwithfileinput.input()asf_input:forlineinf_input:(line,end='')
Then you can already accept input to the script in all of the previously mentioned ways. If you save this script as filein.py and make it executable, you can do all of the following and get the expected output:
$ls | ./filein.py# Prints a directory listing to stdout.$./filein.py /etc/passwd# Reads /etc/passwd to stdout.$./filein.py < /etc/passwd# Reads /etc/passwd to stdout.
The fileinput.input() function creates and returns an instance of the
FileInput class. In addition to containing a few handy helper methods, the
instance can also be used as a context manager. So, to put all of this together,
if we wrote a script that expected to be printing output from several files at
once, we might have it include the filename and line number in the output,
like this:
>>>importfileinput>>>withfileinput.input('/etc/passwd')asf:>>>forlineinf:...(f.filename(),f.lineno(),line,end='').../etc/passwd1##/etc/passwd2# User Database/etc/passwd3#<otheroutputomitted>
Using it as a context manager ensures that the file is closed when it’s no
longer being used, and we leveraged a few handy FileInput helper methods here
to get some extra information in the output.
You want your program to terminate by printing a message to standard error and returning a nonzero status code.
To have a program terminate in this manner, raise a
SystemExit exception, but supply the error message as an
argument. For example:
raiseSystemExit('It failed!')
This will cause the supplied message to be printed to
sys.stderr and the program to exit with a status code
of 1.
This is a small recipe, but it solves a common problem that arises when writing scripts. Namely, to terminate a program, you might be inclined to write code like this:
importsyssys.stderr.write('It failed!\n')raiseSystemExit(1)
None of the extra steps involving import or writing to sys.stderr
are neccessary if you simply supply the message to SystemExit() instead.
You want to write a program that parses options supplied on the command line
(found in sys.argv).
The argparse module can be used to parse command-line options.
A simple example will help to illustrate the essential features:
# search.py'''Hypothetical command-line tool for searching a collection offiles for one or more text patterns.'''importargparseparser=argparse.ArgumentParser(description='Search some files')parser.add_argument(dest='filenames',metavar='filename',nargs='*')parser.add_argument('-p','--pat',metavar='pattern',required=True,dest='patterns',action='append',help='text pattern to search for')parser.add_argument('-v',dest='verbose',action='store_true',help='verbose mode')parser.add_argument('-o',dest='outfile',action='store',help='output file')parser.add_argument('--speed',dest='speed',action='store',choices={'slow','fast'},default='slow',help='search speed')args=parser.parse_args()# Output the collected arguments(args.filenames)(args.patterns)(args.verbose)(args.outfile)(args.speed)
This program defines a command-line parser with the following usage:
bash % python3 search.py -h
usage: search.py [-h] [-p pattern] [-v] [-o OUTFILE] [--speed {slow,fast}]
[filename [filename ...]]Search some files
positional arguments: filename
optional arguments:
-h, --help show this help message and exit
-p pattern, --pat pattern
text pattern to search for
-v verbose mode
-o OUTFILE output file
--speed {slow,fast} search speedThe following session shows how data shows up in the program.
Carefully observe the output of the print() statements.
bash % python3 search.py foo.txt bar.txt
usage: search.py [-h] -p pattern [-v] [-o OUTFILE] [--speed {fast,slow}]
[filename [filename ...]]
search.py: error: the following arguments are required: -p/--patbash % python3 search.py -v -p spam --pat=eggs foo.txt bar.txt filenames = ['foo.txt', 'bar.txt'] patterns = ['spam', 'eggs'] verbose = True outfile = None speed = slow
bash % python3 search.py -v -p spam --pat=eggs foo.txt bar.txt -o results filenames = ['foo.txt', 'bar.txt'] patterns = ['spam', 'eggs'] verbose = True outfile = results speed = slow
bash % python3 search.py -v -p spam --pat=eggs foo.txt bar.txt -o results \
--speed=fast
filenames = ['foo.txt', 'bar.txt']
patterns = ['spam', 'eggs']
verbose = True
outfile = results
speed = fastFurther processing of the options is up to the program. Replace the print() functions
with something more interesting.
The argparse module is one of the largest modules in the standard library, and has a
huge number of configuration options. This recipe shows an essential
subset that can be used and extended to get started.
To parse options, you first create an ArgumentParser instance and
add declarations for the options you want to support it using the
add_argument() method. In each add_argument() call, the dest
argument specifies the name of an attribute where the result of
parsing will be placed. The metavar argument is used when generating
help messages. The action argument specifies the processing
associated with the argument and is often store for storing a value
or append for collecting multiple argument values into a list.
The following argument collects all of the extra command-line arguments into a list. It’s being used to make a list of filenames in the example:
parser.add_argument(dest='filenames',metavar='filename',nargs='*')
The following argument sets a Boolean flag depending on whether or not the argument was provided:
parser.add_argument('-v',dest='verbose',action='store_true',help='verbose mode')
The following argument takes a single value and stores it as a string:
parser.add_argument('-o',dest='outfile',action='store',help='output file')
The following argument specification allows an argument to be repeated multiple times and
all of the values append into a list. The required flag means that the
argument must be supplied at least once. The use of -p and --pat mean that either
argument name is acceptable.
parser.add_argument('-p','--pat',metavar='pattern',required=True,dest='patterns',action='append',help='text pattern to search for')
Finally, the following argument specification takes a value, but checks it against a set of possible choices.
parser.add_argument('--speed',dest='speed',action='store',choices={'slow','fast'},default='slow',help='search speed')
Once the options have been given, you simply execute the parser.parse() method. This
will process the sys.argv value and return an instance with the results. The results
for each argument are placed into an attribute with the name given in the dest
parameter to add_argument().
There are several other approaches for parsing command-line options. For example,
you might be inclined to manually process sys.argv yourself or use the getopt module (which is modeled after a similarly named C library). However, if you take this approach,
you’ll simply end up replicating much of the code that argparse already provides.
You may also encounter code that uses the optparse library to parse options. Although
optparse is very similar to argparse, the latter is more modern and should
be preferred in new projects.
You’ve written a script that requires a password, but since the script is meant for interactive use, you’d like to prompt the user for a password rather than hardcode it into the script.
Python’s getpass module is precisely what you need in this situation. It will
allow you to very easily prompt for a password without having the keyed-in
password displayed on the user’s terminal. Here’s how it’s done:
importgetpassuser=getpass.getuser()passwd=getpass.getpass()ifsvc_login(user,passwd):# You must write svc_login()('Yay!')else:('Boo!')
In this code, the svc_login() function is code that you must write to
further process the password entry. Obviously, the exact handling is
application-specific.
Note in the preceding code that getpass.getuser() doesn’t prompt the user for
their username. Instead, it uses the current user’s login name, according to the
user’s shell environment, or as a last resort, according to the local system’s
password database (on platforms that support the pwd module).
If you want to explicitly prompt the user for their username, which can be more
reliable, use the built-in input function:
user=input('Enter your username: ')
It’s also important to remember that some systems may not support the hiding of
the typed password input to the getpass() method. In this case, Python does
all it can to forewarn you of problems (i.e., it alerts you that passwords will be shown in
cleartext) before moving on.
Use the os.get_terminal_size() function to do this:
>>>importos>>>sz=os.get_terminal_size()>>>szos.terminal_size(columns=80, lines=24)>>>sz.columns80>>>sz.lines24>>>
There are many other possible approaches for obtaining the terminal
size, ranging from reading environment variables to executing low-level
system calls involving ioctl() and TTYs. Frankly, why would you
bother with that when this one simple call will suffice?
Use the subprocess.check_output() function. For example:
importsubprocessout_bytes=subprocess.check_output(['netstat','-a'])
This runs the specified command and returns its output as a byte string. If you need to interpret the resulting bytes as text, add a further decoding step. For example:
out_text=out_bytes.decode('utf-8')
If the executed command returns a nonzero exit code, an exception is raised. Here is an example of catching errors and getting the output created along with the exit code:
try:out_bytes=subprocess.check_output(['cmd','arg1','arg2'])exceptsubprocess.CalledProcessErrorase:out_bytes=e.output# Output generated before errorcode=e.returncode# Return code
By default, check_output() only returns output written to standard output.
If you want both standard output and error collected, use the stderr argument:
out_bytes=subprocess.check_output(['cmd','arg1','arg2'],stderr=subprocess.STDOUT)
If you need to execute a command with a timeout, use the timeout argument:
try:out_bytes=subprocess.check_output(['cmd','arg1','arg2'],timeout=5)exceptsubprocess.TimeoutExpiredase:...
Normally, commands are executed without the assistance of an
underlying shell (e.g., sh, bash, etc.). Instead, the list of
strings supplied are given to a low-level system command, such as
os.execve(). If you want the command to be interpreted by a shell,
supply it using a simple string and give the shell=True argument.
This is sometimes useful if you’re trying to get Python to execute a
complicated shell command involving pipes, I/O redirection, and other
features. For example:
out_bytes=subprocess.check_output('grep python | wc > out',shell=True)
Be aware that executing commands under the shell is a potential security risk if
arguments are derived from user input. The shlex.quote() function can be
used to properly quote arguments for inclusion in shell commands in this case.
The check_output() function is the easiest way to execute an external
command and get its output. However, if you need to perform more
advanced communication with a subprocess, such as sending it input, you’ll need
to take a difference approach. For that, use the subprocess.Popen class
directly. For example:
importsubprocess# Some text to sendtext=b'''hello worldthis is a testgoodbye'''# Launch a command with pipesp=subprocess.Popen(['wc'],stdout=subprocess.PIPE,stdin=subprocess.PIPE)# Send the data and get the outputstdout,stderr=p.communicate(text)# To interpret as text, decodeout=stdout.decode('utf-8')err=stderr.decode('utf-8')
The subprocess module is not suitable for communicating with external commands
that expect to interact with a proper TTY. For example, you can’t use it to
automate tasks that ask the user to enter a password (e.g., a ssh session).
For that, you would need to turn to a third-party module, such as those based
on the popular “expect” family of tools (e.g., pexpect or similar).
You need to copy or move files and directories around, but you don’t want to do it by calling out to shell commands.
The shutil module has portable implementations of functions for
copying files and directories. The usage is extremely
straightforward. For example:
importshutil# Copy src to dst. (cp src dst)shutil.copy(src,dst)# Copy files, but preserve metadata (cp -p src dst)shutil.copy2(src,dst)# Copy directory tree (cp -R src dst)shutil.copytree(src,dst)# Move src to dst (mv src dst)shutil.move(src,dst)
The arguments to these functions are all strings supplying file or directory names. The underlying semantics try to emulate that of similar Unix commands, as shown in the comments.
By default, symbolic links are followed by these commands. For
example, if the source file is a symbolic link, then the destination
file will be a copy of the file the link points to. If you want to copy
the symbolic link instead, supply the follow_symlinks keyword argument like this:
shutil.copy2(src,dst,follow_symlinks=False)
If you want to preserve symbolic links in copied directories, do this:
shutil.copytree(src,dst,symlinks=True)
The copytree() optionally allows you to ignore certain files and
directories during the copy process. To do this, you supply an
ignore function that takes a directory name and filename listing as
input, and returns a list of names to ignore as a result. For example:
defignore_pyc_files(dirname,filenames):return[nameinfilenamesifname.endswith('.pyc')]shutil.copytree(src,dst,ignore=ignore_pyc_files)
Since ignoring filename patterns is common, a utility function ignore_patterns() has already been
provided to do it. For example:
shutil.copytree(src,dst,ignore=shutil.ignore_patterns('*~','*.pyc'))
Using shutil to copy files and directories is mostly
straightforward. However, one caution concerning file metadata is
that functions such as copy2() only make a best effort in preserving
this data. Basic information, such as access times, creation times,
and permissions, will always be preserved, but preservation of owners,
ACLs, resource forks, and other extended file metadata may or may not
work depending on the underlying operating system and the user’s own
access permissions. You probably wouldn’t want to use a function like
shutil.copytree() to perform system backups.
When working with filenames, make sure you use the functions in
os.path for the greatest portability (especially if working with
both Unix and Windows). For example:
>>>filename='/Users/guido/programs/spam.py'>>>importos.path>>>os.path.basename(filename)'spam.py'>>>os.path.dirname(filename)'/Users/guido/programs'>>>os.path.split(filename)('/Users/guido/programs', 'spam.py')>>>os.path.join('/new/dir',os.path.basename(filename))'/new/dir/spam.py'>>>os.path.expanduser('~/guido/programs/spam.py')'/Users/guido/programs/spam.py'>>>
One tricky bit about copying directories with copytree() is the handling of errors.
For example, in the process of copying, the function might encounter
broken symbolic links, files that can’t be accessed due to permission
problems, and so on. To deal with this, all exceptions encountered are collected into
a list and grouped into a single exception that gets raised at the end of the operation.
Here is how you would handle it:
try:shutil.copytree(src,dst)exceptshutil.Errorase:forsrc,dst,msgine.args[0]:# src is source name# dst is destination name# msg is error message from exception(dst,src,msg)
If you supply the
ignore_dangling_symlinks=True keyword argument, then copytree() will ignore dangling symlinks.
The functions shown in this recipe are probably the most commonly used. However,
shutil has many more operations related to copying data. The documentation is
definitely worth a further look. See the Python documentation.
The shutil module has two functions—make_archive() and
unpack_archive()—that do exactly what you want. For example:
>>>importshutil>>>shutil.unpack_archive('Python-3.3.0.tgz')>>>shutil.make_archive('py33','zip','Python-3.3.0')'/Users/beazley/Downloads/py33.zip'>>>
The second argument to make_archive() is the desired output format.
To get a list of supported archive formats, use
get_archive_formats(). For example:
>>>shutil.get_archive_formats()[('bztar', "bzip2'ed tar-file"), ('gztar', "gzip'ed tar-file"),('tar', 'uncompressed tar file'), ('zip', 'ZIP file')]>>>
Python has other library modules for dealing with the low-level details of
various archive formats (e.g., tarfile, zipfile, gzip, bz2, etc.).
However, if all you’re trying to do is make or extract an archive, there’s
really no need to go so low level. You can just use these high-level
functions in shutil instead.
The functions have a variety of additional options for logging, dryruns,
file permissions, and so forth. Consult the shutil library documentation
for further details.
You need to write a script that involves finding files, like a file renaming script or a log archiver utility, but you’d rather not have to call shell utilities from within your Python script, or you want to provide specialized behavior not easily available by “shelling out.”
To search for files, use the os.walk() function, supplying it with the
top-level directory. Here is an example of a function that finds a specific
filename and prints out the full path of all matches:
#!/usr/bin/env python3.3importosdeffindfile(start,name):forrelpath,dirs,filesinos.walk(start):ifnameinfiles:full_path=os.path.join(start,relpath,name)(os.path.normpath(os.path.abspath(full_path)))if__name__=='__main__':findfile(sys.argv[1],sys.argv[2])
Save this script as findfile.py and run it from the command line, feeding in the starting point and the name as positional arguments, like this:
bash % ./findfile.py . myfile.txt
The os.walk() method traverses the directory hierarchy for us, and for
each directory it enters, it returns a 3-tuple, containing the relative path to
the directory it’s inspecting, a list containing all of the directory names in
that directory, and a list of filenames in that directory.
For each tuple, you simply check if the target filename is in the files list.
If it is, os.path.join() is used to put together a path. To avoid
the possibility of weird looking paths like ././foo//bar,
two additional functions are used to fix the result.
The first is os.path.abspath(), which takes a path that
might be relative and forms the absolute path, and the second is os.path.normpath(), which
will normalize the path, thereby resolving issues with double slashes, multiple
references to the current directory, and so on.
Although this script is pretty simple compared to the features of the
find utility found on UNIX platforms, it has the benefit of being
cross-platform. Furthermore, a lot of additional functionality can be
added in a portable manner without much more work. To
illustrate, here is a function that prints out all of the files that
have a recent modification time:
#!/usr/bin/env python3.3importosimporttimedefmodified_within(top,seconds):now=time.time()forpath,dirs,filesinos.walk(top):fornameinfiles:fullpath=os.path.join(path,name)ifos.path.exists(fullpath):mtime=os.path.getmtime(fullpath)ifmtime>(now-seconds):(fullpath)if__name__=='__main__':importsysiflen(sys.argv)!=3:('Usage: {} dir seconds'.format(sys.argv[0]))raiseSystemExit(1)modified_within(sys.argv[1],float(sys.argv[2]))
It wouldn’t take long for you to build far more complex operations on top of
this little function using various features of the os, os.path, glob, and
similar modules. See Recipes 5.11 and 5.13 for related recipes.
The configparser module can be used to read configuration files. For
example, suppose you have this configuration file:
; config.ini ; Sample configuration file
[installation] library=%(prefix)s/lib include=%(prefix)s/include bin=%(prefix)s/bin prefix=/usr/local
# Setting related to debug configuration [debug] log_errors=true show_warnings=False
[server]
port: 8080
nworkers: 32
pid-file=/tmp/spam.pid
root=/www/root
signature:
=================================
Brought to you by the Python Cookbook
=================================Here is an example of how to read it and extract values:
>>>fromconfigparserimportConfigParser>>>cfg=ConfigParser()>>>cfg.read('config.ini')['config.ini']>>>cfg.sections()['installation', 'debug', 'server']>>>cfg.get('installation','library')'/usr/local/lib'>>>cfg.getboolean('debug','log_errors')True>>>cfg.getint('server','port')8080>>>cfg.getint('server','nworkers')32>>>(cfg.get('server','signature'))=================================Brought to you by the Python Cookbook=================================>>>
If desired, you can also modify the configuration and write it
back to a file using the cfg.write() method. For example:
>>>cfg.set('server','port','9000')>>>cfg.set('debug','log_errors','False')>>>importsys>>>cfg.write(sys.stdout)[installation]library = %(prefix)s/libinclude = %(prefix)s/includebin = %(prefix)s/binprefix = /usr/local[debug]log_errors = Falseshow_warnings = False[server]port = 9000nworkers = 32pid-file = /tmp/spam.pidroot = /www/rootsignature ==================================Brought to you by the Python Cookbook=================================>>>
Configuration files are well suited as a human-readable format for specifying configuration data to your program. Within each config file, values are grouped into different sections (e.g., “installation,” “debug,” and “server,” in the example). Each section then specifies values for various variables in that section.
There are several notable differences between a config file and using a Python source file for the same purpose. First, the syntax is much more permissive and “sloppy.” For example, both of these assignments are equivalent:
prefix=/usr/local prefix: /usr/local
The names used in a config file are also assumed to be case-insensitive. For example:
>>>cfg.get('installation','PREFIX')'/usr/local'>>>cfg.get('installation','prefix')'/usr/local'>>>
When parsing values, methods such as getboolean() look for any reasonable
value. For example, these are all equivalent:
log_errors = true
log_errors = TRUE
log_errors = Yes
log_errors = 1Perhaps the most significant difference between a config file and Python code
is that, unlike scripts, configuration files are not executed in a top-down manner. Instead,
the file is read in its entirety. If variable substitutions are made, they
are done after the fact. For example, in this part of the config file, it
doesn’t matter that the prefix variable is assigned after other variables
that happen to use it:
[installation]
library=%(prefix)s/lib
include=%(prefix)s/include
bin=%(prefix)s/bin
prefix=/usr/localAn easily overlooked feature of ConfigParser is that it can
read multiple configuration files together and merge their results into
a single configuration. For example, suppose a user made their
own configuration file that looked like this:
; ~/.config.ini
[installation]
prefix=/Users/beazley/test
[debug]
log_errors=FalseThis file can be merged with the previous configuration by reading it separately. For example:
>>># Previously read configuration>>>cfg.get('installation','prefix')'/usr/local'>>># Merge in user-specific configuration>>>importos>>>cfg.read(os.path.expanduser('~/.config.ini'))['/Users/beazley/.config.ini']>>>cfg.get('installation','prefix')'/Users/beazley/test'>>>cfg.get('installation','library')'/Users/beazley/test/lib'>>>cfg.getboolean('debug','log_errors')False>>>
Observe how the override of the prefix variable affects other
related variables, such as the setting of library. This works
because variable interpolation is performed as late as possible. You
can see this by trying the following experiment:
>>>cfg.get('installation','library')'/Users/beazley/test/lib'>>>cfg.set('installation','prefix','/tmp/dir')>>>cfg.get('installation','library')'/tmp/dir/lib'>>>
Finally, it’s important to note that Python does not support the full range of
features you might find in an .ini file used by other programs (e.g.,
applications on Windows). Make sure you consult the configparser
documentation for the finer details of the syntax and supported features.
The easiest way to add logging to simple programs is to use the logging module. For example:
importloggingdefmain():# Configure the logging systemlogging.basicConfig(filename='app.log',level=logging.ERROR)# Variables (to make the calls that follow work)hostname='www.python.org'item='spam'filename='data.csv'mode='r'# Example logging calls (insert into your program)logging.critical('Host%sunknown',hostname)logging.error("Couldn't find%r",item)logging.warning('Feature is deprecated')logging.info('Opening file%r, mode=%r',filename,mode)logging.debug('Got here')if__name__=='__main__':main()
The five logging calls (critical(), error(), warning(), info(),
debug()) represent different severity levels in decreasing order. The
level argument to basicConfig() is a filter. All messages issued at
a level lower than this setting will be ignored.
The argument to each logging operation is a message string followed by
zero or more arguments. When making the final log message, the % operator
is used to format the message string using the supplied arguments.
If you run this program, the contents of the file app.log will be as follows:
CRITICAL:root:Host www.python.org unknown
ERROR:root:Could not find 'spam'If you want to change the output or level of output, you can change
the parameters to the basicConfig() call. For example:
logging.basicConfig(filename='app.log',level=logging.WARNING,format='%(levelname)s:%(asctime)s:%(message)s')
As a result, the output changes to the following:
CRITICAL:2012-11-20 12:27:13,595:Host www.python.org unknown
ERROR:2012-11-20 12:27:13,595:Could not find 'spam'
WARNING:2012-11-20 12:27:13,595:Feature is deprecatedAs shown, the logging configuration is hardcoded directly into the
program. If you want to configure it from a configuration file,
change the basicConfig() call to the following:
importloggingimportlogging.configdefmain():# Configure the logging systemlogging.config.fileConfig('logconfig.ini')...
Now make a configuration file logconfig.ini that looks like this:
[loggers]
keys=root
[handlers]
keys=defaultHandler
[formatters]
keys=defaultFormatter
[logger_root]
level=INFO
handlers=defaultHandler
qualname=root
[handler_defaultHandler]
class=FileHandler
formatter=defaultFormatter
args=('app.log', 'a')
[formatter_defaultFormatter]
format=%(levelname)s:%(name)s:%(message)sIf you want to make changes to the configuration, you can simply edit the logconfig.ini file as appropriate.
Ignoring for the moment that there are about a million advanced
configuration options for the logging module, this solution
is quite sufficient for simple programs and scripts. Simply make
sure that you execute the basicConfig() call prior to making
any logging calls, and your program will generate logging output.
If you want the logging messages to route to standard error instead
of a file, don’t supply any filename information to basicConfig().
For example, simply do this:
logging.basicConfig(level=logging.INFO)
One subtle aspect of basicConfig() is that it can only be called
once in your program. If you later need to change the configuration
of the logging module, you need to obtain the root logger and make
changes to it directly. For example:
logging.getLogger().level=logging.DEBUG
It must be emphasized that this recipe only shows a basic use of the
logging module. There are significantly more advanced customizations that
can be made. An excellent resource for such customization is the “Logging Cookbook”.
You would like to add a logging capability to a library, but don’t want it to interfere with programs that don’t use logging.
For libraries that want to perform logging, you should create a dedicated logger object, and initially configure it as follows:
# somelib.pyimportlogginglog=logging.getLogger(__name__)log.addHandler(logging.NullHandler())# Example function (for testing)deffunc():log.critical('A Critical Error!')log.debug('A debug message')
With this configuration, no logging will occur by default. For example:
>>>importsomelib>>>somelib.func()>>>
However, if the logging system gets configured, log messages will start to appear. For example:
>>>importlogging>>>logging.basicConfig()>>>somelib.func()CRITICAL:somelib:A Critical Error!>>>
Libraries present a special problem for logging, since information about the environment in which they are used isn’t known. As a general rule, you should never write library code that tries to configure the logging system on its own or which makes assumptions about an already existing logging configuration. Thus, you need to take great care to provide isolation.
The call to getLogger(__name__) creates a logger module that has the
same name as the calling module. Since all modules are unique, this
creates a dedicated logger that is likely to be separate from other
loggers.
The log.addHandler(logging.NullHandler()) operation attaches a null
handler to the just created logger object. A null handler
ignores all logging messages by default. Thus, if the library is used
and logging is never configured, no messages or warnings will appear.
One subtle feature of this recipe is that the logging of individual libraries can be independently configured, regardless of other logging settings. For example, consider the following code:
>>>importlogging>>>logging.basicConfig(level=logging.ERROR)>>>importsomelib>>>somelib.func()CRITICAL:somelib:A Critical Error!>>># Change the logging level for 'somelib' only>>>logging.getLogger('somelib').level=logging.DEBUG>>>somelib.func()CRITICAL:somelib:A Critical Error!DEBUG:somelib:A debug message>>>
Here, the root logger has been configured to only output messages
at the ERROR level or higher. However, the level of the
logger for somelib has been separately configured to output debugging
messages. That setting takes precedence over the global setting.
The ability to change the logging settings for a single module like this can be a useful debugging tool, since you don’t have to change any of the global logging settings—simply change the level for the one module where you want more output.
The “Logging HOWTO”
has more information about configuring the logging module and
other useful tips.
You want to be able to record the time it takes to perform various tasks.
The time module contains various functions for performing timing-related
functions. However, it’s often useful to put a higher-level interface
on them that mimics a stop watch. For example:
importtimeclassTimer:def__init__(self,func=time.perf_counter):self.elapsed=0.0self._func=funcself._start=Nonedefstart(self):ifself._startisnotNone:raiseRuntimeError('Already started')self._start=self._func()defstop(self):ifself._startisNone:raiseRuntimeError('Not started')end=self._func()self.elapsed+=end-self._startself._start=Nonedefreset(self):self.elapsed=0.0@propertydefrunning(self):returnself._startisnotNonedef__enter__(self):self.start()returnselfdef__exit__(self,*args):self.stop()
This class defines a timer that can be started, stopped, and reset as needed by
the user. It keeps track of the total elapsed time in the elapsed attribute. Here is an example that shows how it can be used:
defcountdown(n):whilen>0:n-=1# Use 1: Explicit start/stopt=Timer()t.start()countdown(1000000)t.stop()(t.elapsed)# Use 2: As a context managerwitht:countdown(1000000)(t.elapsed)withTimer()ast2:countdown(1000000)(t2.elapsed)
This recipe provides a simple yet very useful class for making timing
measurements and tracking elapsed time. It’s also a nice illustration
of how to support the context-management protocol and the with statement.
One issue in making timing measurements concerns the underlying time
function used to do it. As a general rule, the accuracy of timing
measurements made with functions such as time.time() or
time.clock() varies according to the operating system. In contrast,
the time.perf_counter() function always uses the highest-resolution timer
available on the system.
As shown, the time recorded by the Timer class is made according to wall-clock
time, and includes all time spent sleeping. If you only want the amount of CPU time
used by the process, use time.process_time() instead. For example:
t=Timer(time.process_time)witht:countdown(1000000)(t.elapsed)
Both the time.perf_counter() and time.process_time() return a “time” in
fractional seconds. However, the actual value of the time doesn’t have any
particular meaning. To make sense of the results, you have to call the
functions twice and compute a time difference.
More examples of timing and profiling are given in Recipe 14.13.
The resource module can be used to perform both tasks. For
example, to restrict CPU time, do the following:
importsignalimportresourceimportosdeftime_exceeded(signo,frame):("Time's up!")raiseSystemExit(1)defset_max_runtime(seconds):# Install the signal handler and set a resource limitsoft,hard=resource.getrlimit(resource.RLIMIT_CPU)resource.setrlimit(resource.RLIMIT_CPU,(seconds,hard))signal.signal(signal.SIGXCPU,time_exceeded)if__name__=='__main__':set_max_runtime(15)whileTrue:pass
When this runs, the SIGXCPU signal is generated when the time
expires. The program can then clean up and exit.
To restrict memory use, put a limit on the total address space in use. For example:
importresourcedeflimit_memory(maxsize):soft,hard=resource.getrlimit(resource.RLIMIT_AS)resource.setrlimit(resource.RLIMIT_AS,(maxsize,hard))
With a memory limit in place, programs will start generating
MemoryError exceptions when no more memory is available.
In this recipe, the setrlimit() function is used to set a soft and
hard limit on a particular resource. The soft limit is a value
upon which the operating system will typically restrict or notify the process
via a signal. The hard limit represents an upper bound on the values
that may be used for the soft limit. Typically, this is controlled by
a system-wide parameter set by the system administrator.
Although the hard limit can be lowered, it can never be raised by user
processes (even if the process lowered itself).
The setrlimit() function can additionally be used to set limits
on things such as the number of child processes, number of open
files, and similar system resources. Consult the documentation
for the resource module for further details.
Be aware that this recipe only works on Unix systems, and that it might not work on all of them. For example, when tested, it works on Linux but not on OS X.
You want to launch a browser from a script and have it point to some URL that you specify.
The webbrowser module can be used to launch a browser in a
platform-independent manner. For example:
>>>importwebbrowser>>>webbrowser.open('http://www.python.org')True>>>
This opens the requested page using the default browser. If you want a bit more control over how the page gets opened, you can use one of the following functions:
>>># Open the page in a new browser window>>>webbrowser.open_new('http://www.python.org')True>>>>>># Open the page in a new browser tab>>>webbrowser.open_new_tab('http://www.python.org')True>>>
These will try to open the page in a new browser window or tab, if possible and supported by the browser.
If you want to open a page in a specific browser, you can use the
webbrowser.get() function to specify a particular browser.
For example:
>>>c=webbrowser.get('firefox')>>>c.open('http://www.python.org')True>>>c.open_new_tab('http://docs.python.org')True>>>
A full list of supported browser names can be found in the Python documentation.
Being able to easily launch a browser can be a useful operation in
many scripts. For example, maybe a script performs some kind of
deployment to a server and you’d like to have it quickly launch a
browser so you can verify that it’s working. Or maybe a program
writes data out in the form of HTML pages and you’d just like to fire
up a browser to see the result. Either way, the webbrowser module
is a simple solution.