From time to time, I come across or come up with interesting ways to solve problems in Python. To avoid forgetting them, I plan to update this post as I add more recipes to my collection.
If you know of a better way to do something, let me know!
When reading a file, Python can automatically convert line endings to \n
(called universal newlines). What if you have a string, though, and want to ensure that it has the right line endings?
:::python
from io import StringIO

s_before = "I'm\r\nfrom\r\nWindows\r\n"
print list(s_before)
# ['I', "'", 'm', '\r', '\n', 'f', 'r', 'o', 'm', '\r', '\n', 'W', 'i', 'n', 'd', 'o', 'w', 's', '\r', '\n']

s_after = StringIO(unicode(s_before), newline=None).read()
print list(s_after)
# [u'I', u"'", u'm', u'\n', u'f', u'r', u'o', u'm', u'\n', u'W', u'i', u'n', u'd', u'o', u'w', u's', u'\n']
This uses the StringIO class from the io module (not the StringIO
module), which requires you to convert your string to Unicode. The newline
keyword argument controls how newline characters are translated (None
enables universal newline mode).
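To see what the newline argument actually does, compare None with "" (which turns translation off). A quick sketch of the contrast, assuming the same Python 2 io.StringIO as above:

:::python
from io import StringIO

s = "one\r\ntwo\r\n"

# newline="" leaves the line endings untouched
print list(StringIO(unicode(s), newline="").read())    # '\r' and '\n' both still present

# newline=None enables universal newline mode, so everything comes back as '\n'
print list(StringIO(unicode(s), newline=None).read())   # only '\n'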
Some libraries require files to have a specific extension, which can be a pain when your temporary file is named something like tmp4HknTk. NamedTemporaryFile to the rescue!
:::python
import tempfile

EXTENSION = ".txt"

with tempfile.NamedTemporaryFile(suffix=EXTENSION) as tmp:
    print "Temporary file path:", tmp.name
    tmp.write("Hello World\n")

# Temporary file is automatically deleted
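Some libraries also insist on opening the file by name themselves, which doesn't play well with an already-open handle. A sketch using the delete keyword argument (and cleaning up manually afterwards), assuming you can close the file before handing over its path:

:::python
import os, tempfile

tmp = tempfile.NamedTemporaryFile(suffix=".txt", delete=False)
try:
    tmp.write("Hello World\n")
    tmp.close()
    # ... pass tmp.name to the library that insists on a .txt file ...
finally:
    os.unlink(tmp.name)  # delete=False means we have to clean up ourselves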
Not terribly complicated, but this is a good example of elegance in Python. This function will generate a random string with (ASCII) uppercase/lowercase letters and numbers.
:::python
import random, string

def random_string(length):
    # Note: random.sample draws without replacement, so no character repeats
    # and length can't exceed the 62 available letters and digits
    return "".join(random.sample(string.ascii_letters + string.digits, length))

print random_string(10)
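If repeated characters are fine (or you need more than 62 characters), a variant built on random.choice works too; the name random_string2 is just for illustration:

:::python
import random, string

def random_string2(length):
    chars = string.ascii_letters + string.digits
    # random.choice picks with replacement, so characters can repeat
    return "".join(random.choice(chars) for _ in range(length))

print random_string2(100)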
The subprocess module has a lot of nice stuff for running external commands. The two functions I use all the time are check_call and check_output.
:::python
import subprocess

# Raises an exception if the command fails (non-zero exit code)
subprocess.check_call(["touch", "/path/to/file"])

result = subprocess.check_output(["ls", "-l", "-a"])
print result
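If a failing command is something you expect and want to handle, you can catch the exception instead of letting it propagate; roughly:

:::python
import subprocess

try:
    subprocess.check_call(["touch", "/path/to/file"])
except subprocess.CalledProcessError as error:
    print "Command failed with exit code", error.returncode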
I used this recently to import a SQL file into MySQL:
:::python
# mysql -u $user -p$password $db_name < $sql_path
with open(sql_path, "r") as sql_file:
    subprocess.check_call(["mysql", "-u", user, "-p{0}".format(password), db_name],
                          stdin=sql_file)
When in doubt, glob it!
:::python
from glob import glob

# All Python files in the current directory
for path in glob("*.py"):
    print path

# All Python files one directory down
for path in glob("*/*.py"):
    print path
If you have a full path but just want the file name (without the extension), use this function. There's probably already a function to do this, but I have yet to find it.
:::python
import os

def get_name(path):
    return os.path.splitext(os.path.split(path)[1])[0]

print get_name("/path/to/a/file.extension")
# file
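For what it's worth, os.path.basename covers the "drop the directory" half, so the same thing fits in a one-liner:

:::python
import os

print os.path.splitext(os.path.basename("/path/to/a/file.extension"))[0]
# file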
Let's say you have a dictionary and you want to sort the keys by the values. Easy as pie!
:::python
d = {
    "a": 3,
    "b": 1,
    "c": 2
}

print sorted(d.keys())
# a, b, c

print sorted(d.keys(), key=lambda k: d[k])
# b, c, a
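If you want the key/value pairs rather than just the keys, sorting the items works the same way:

:::python
# Using the same d as above
print sorted(d.items(), key=lambda item: item[1])
# [('b', 1), ('c', 2), ('a', 3)]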
A quick way to check that a list contains no duplicates:

:::python
x = [1, 7, 10, 3, 2, 5, 0, 20, 4]
assert len(x) == len(set(x))
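And if the assertion fails and you want to know which items are duplicated, collections.Counter can tell you; a small sketch:

:::python
from collections import Counter

x = [1, 7, 10, 3, 2, 5, 0, 20, 4, 7]
print [item for item, count in Counter(x).items() if count > 1]
# [7]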
Replace this:
:::python
import sys

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print "Usage: my_script.py arg1 arg2 [--flag]"
        sys.exit(1)

    arg1 = int(sys.argv[1])
    arg2 = sys.argv[2]
    flag = True if "--flag" in sys.argv[1:] else False

    print arg1, arg2, flag
with this:
:::python
import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="My Awesome Script")
    parser.add_argument("arg1", type=int, help="The first argument")
    parser.add_argument("arg2", type=str, help="The second argument")
    parser.add_argument("--flag", action="store_true", help="An optional flag")
    args = parser.parse_args()

    print args.arg1, args.arg2, args.flag
With argparse, you get a lot for free: argument type checking, flags in any position (arg1 --flag arg2 works just as well as arg1 arg2 --flag), and a help screen (pass -h to see it). Adding flags with short names and arguments is also a piece of cake:
parser.add_argument("-c", "--count", type=int, default=0, help="Count of something") # ... print args.count
Maybe it's evidence of a bad design, but on a few occasions I've needed to import a library script from one directory up. It's easy to import a script from a child directory, but a parent directory takes a little more work.
:::python
import sys, os

# Add parent directory to path
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

# Now you can import scripts in the parent directory
from parent_script import magic_function
I have some large CSV files that I like to keep zipped up. Not all libraries support reading zipped CSV files, but it's easy enough to use the zipfile module as an intermediary.
:::python
import zipfile

def from_zip(path):
    archive = zipfile.ZipFile(path)
    return archive.open(archive.namelist()[0])

with from_zip("/path/to/file.zip") as my_file:
    for line in my_file:
        print line
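Since these are CSV files, the file-like object from from_zip can go straight into csv.reader, which only needs an iterable of lines; something like:

:::python
import csv

with from_zip("/path/to/file.zip") as my_file:
    for row in csv.reader(my_file):
        print row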
Here's a little function to zip up a file and then delete the original:
:::python
import os, zipfile

def zip_delete(in_path, zip_path):
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as new_zip:
        new_zip.write(in_path)
    # Only delete the original once the archive has been written and closed
    os.unlink(in_path)
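One thing to watch for: ZipFile.write stores the file under its original path inside the archive (minus any leading slash), so you can end up with nested directories. If you only want the bare file name in there, passing arcname in the write call above fixes that:

:::python
# Drop-in replacement for the new_zip.write(in_path) line
new_zip.write(in_path, arcname=os.path.basename(in_path))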
Recursively match a file name filter (e.g., *.mp3):
:::python
import fnmatch, os

def globr(pattern, dir="."):
    for root, _, file_names in os.walk(dir):
        # fnmatch.filter keeps only the names matching the shell-style pattern
        for f_name in fnmatch.filter(file_names, pattern):
            yield os.path.join(root, f_name)
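Usage is the same generator-style iteration as glob:

:::python
for path in globr("*.mp3", "/path/to/music"):
    print path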