What Makes Code Hard to Understand?
Posted: 2013-04-26

[arXiv paper] [eyeCode data set]

What factors impact the comprehensibility of code? In this blog post, I'll describe an experiment I did with my advisors Andrew Lumsdaine (Computer Science) and Rob Goldstone (Cognitive Science) at Indiana University.

We asked 162 programmers to predict the output of 10 small Python programs. Each program had 2 or 3 different versions, and we used subtle differences between program versions to demonstrate that seemingly insignificant notational changes can have big effects on correctness and response times. I'll go over some of the results here, hopefully to whet your appetite for the paper.


An Introduction to Pandas
Posted: 2013-04-23

When dealing with numeric matrices and vectors in Python, NumPy makes life a lot easier. For more complex data, however, it leaves a lot to be desired. If you're used to working with data frames in R, doing data analysis directly with NumPy feels like a step back.

Fortunately, some nice folks have written the Python Data Analysis Library (a.k.a. pandas). Pandas provides an R-like DataFrame, produces high quality plots with matplotlib, and integrates nicely with other libraries that expect NumPy arrays.

In this tutorial, we'll go through the basics of pandas using a year's worth of weather data from Weather Underground. Pandas has a lot of functionality, so we'll only be able to cover a small fraction of what you can do. Check out the (very readable) pandas docs if you want to learn more.


Python Recipes
Posted: 2013-03-02

From time to time, I come across or come up with interesting ways to solve problems in Python. To avoid forgetting them, I plan to update this post as I add more recipes to my collection.

If you know of a better way to do something, let me know!


matplotlib and numpy: Double Trouble
Posted: 2012-05-19

[Code and Data]

For this tutorial, we'll be plotting some weather data from a site called Weather Underground. You can download temperature readings and weather events for your local area as a comma-separated file.

I've put weather data for Bloomington, IN in a file called weather.csv. Each row is one day, and there are columns for min/mean/max temperature, dew point, wind speed, etc. We'll be plotting temperature and weather event data (e.g., rain, snow).
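As a rough sketch of the first step, here is one way to pull a few named columns out of a file like weather.csv using only the standard library. This is not necessarily how the tutorial does it. The function name read_columns is made up for illustration, and it assumes the file's first row is a header; the actual column names in weather.csv aren't listed here, so you would pass in whatever the header uses.

```python
import csv


def read_columns(lines, names):
    """Return {column name: list of string values} for the requested columns.

    `lines` is any iterable of text lines (an open file works). The first
    line is assumed to be a header row naming the columns.
    """
    reader = csv.DictReader(lines, skipinitialspace=True)
    columns = {name: [] for name in names}
    for row in reader:
        for name in names:
            # Values stay as strings; convert numeric columns with float()
            # before plotting, and leave text columns (e.g., events) as-is.
            columns[name].append(row[name])
    return columns
```

You would call it with an open file, e.g. `read_columns(open("weather.csv", newline=""), [...])`, filling in the column names you want from the header.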


An Exercise with Functions and Plotting
Posted: 2012-05-11

[Code and Data]

Let's say you have a text file called workout.csv that contains information about your workouts for the month of March:

# date, kind of workout, distance (miles), time (min)
"2012, Mar-01", run, 2, 25
"2012, Mar-03", bike, 10, 55
"2012, Mar-06", bike, 5, 20
"2012, Mar-09", run, 3, 42
"2012, Mar-10", skateboarding, 2, 10

# Broke my leg :(

"2012, Mar-11", Wii, 0, 60
"2012, Mar-12", Wii, 0, 60
"2012, Mar-13", Wii, 0, 60
"2012, Mar-14", Wii, 0, 60

It's a comma-separated values (CSV) file, but it contains comments and blank lines. The first line (a comment) describes the fields, which are (from left to right) the date of the workout, the kind of workout, how many miles you traveled, and how many minutes you spent. (Note: I didn't actually break my leg; it's just an example!)

Our goal will be to read this data into Python and plot a graph with the day of the month on the x-axis and the time worked out on the y-axis. Let's get started.
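Here is a minimal sketch of one way to get there, not necessarily the approach the full post takes. The function names read_workouts and plot_minutes_by_day and the output file name workout.png are made up for illustration. It assumes every data row has exactly the four fields shown above and that month abbreviations are in English.

```python
import csv
from datetime import datetime


def read_workouts(lines):
    """Parse workout rows, skipping comment lines and blank lines."""
    # Drop blank lines and '#' comments before the csv module sees them.
    data_lines = (
        ln for ln in lines
        if ln.strip() and not ln.lstrip().startswith("#")
    )
    workouts = []
    # skipinitialspace handles the space after each comma; the quoted
    # date keeps its embedded comma as part of a single field.
    reader = csv.reader(data_lines, skipinitialspace=True)
    for date_text, kind, miles, minutes in reader:
        date = datetime.strptime(date_text, "%Y, %b-%d").date()
        workouts.append((date, kind, float(miles), float(minutes)))
    return workouts


def plot_minutes_by_day(workouts, out_path="workout.png"):
    """Plot time worked out against day of the month (needs matplotlib)."""
    import matplotlib.pyplot as plt

    days = [date.day for date, _, _, _ in workouts]
    minutes = [mins for _, _, _, mins in workouts]
    plt.plot(days, minutes, marker="o")
    plt.xlabel("Day of month")
    plt.ylabel("Time worked out (min)")
    plt.savefig(out_path)
```

To use it, open the file and pass it straight in: `plot_minutes_by_day(read_workouts(open("workout.csv", newline="")))`. Filtering the lines first keeps the parsing itself simple, since the csv module has no built-in notion of comment lines.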



    Contents © 2013 Michael Hansen - Powered by Nikola