Visualizing Climate Data – I – Using Python Modules

Note – I’m going to assume that you have a working knowledge and installation of the Python programming language. I’m starting with Python modules because in my experience, people coming from other languages may not be familiar with the ‘pluggability’ that Python gives to its users via these modules.

Whenever I persuade my friends to try Python, I introduce two initial arguments – it’s easy, and you can use modules. Most people immediately agree with the “it’s easy” part; Python reads like pseudo-code, and it’s usually trivial to translate a program from your brain to the computer screen. However, it usually takes them a while longer to see what is so special about modules. Novices tend to be familiar with stock modules like math or os, and more advanced converts might experiment with re.

The real power of Python, though, comes from third-party modules. In a nutshell, it is incredibly easy for developers to package their custom-built Python utilities and distribute them to other coders – a search for ‘python’ on Google Code clearly illustrates this fact. You can find third-party modules for nearly any task – manipulating data, managing a web page, plotting data, or even posting to Twitter. Let’s use that last one as an example to explore how to install and use Python modules.

Python-twitter, a module by DeWitt Clinton, provides easy access to the Twitter API via Python. It’s easy to use – a few weeks ago I built some Twitter functionality into my personal website using it. Even if you don’t use Twitter, I’d recommend reading through this example to see how to install, access, and use generic Python modules.

Installing a Module

If you opted to download the tar from the Google Code home of Python-twitter, then you’ll need to enter ‘tar -xvzf xxx’ (where ‘xxx’ is the archive’s name) in a console to unpack it. If you downloaded the trunk via Subversion, you should start out with an unarchived directory.

Installing modules usually involves two steps – building and installation. Sometimes, you won’t need to build anything – a lot of modules are just libraries of pure Python code. Other times, modules might rely on C or FORTRAN code, and building involves setuptools analyzing your system for the appropriate compiler and building from the bundled source. Regardless, you’ll typically need to cd into the unzipped module directory and run two console commands:


python setup.py build

python setup.py install [--home=~/]

A Python module is distributed with a ‘setup.py’ file which automates the installation process. Calling the first command here will build the source for your module and place it in a ‘build’ directory in the unzipped archive. Calling the second command (without the bracketed part) will copy it to your Python installation’s module directory. This will probably be somewhere like /usr/local/lib/python2.x/site-packages, but may differ depending on your system. Often, you’ll need to prefix the ‘install’ command with sudo and enter your password. Doing this will install the module so that every user of the system has access to it.
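If you aren’t sure where your own installation keeps third-party modules, you can ask Python directly. The following is a minimal sketch, assuming a Python version that ships the standard-library sysconfig module (2.7, or 3.2 and later); the variable name install_dir is only for illustration.

```python
import sysconfig

# 'purelib' is the directory where pure-Python third-party modules
# are installed for the interpreter running this code.
install_dir = sysconfig.get_paths()["purelib"]
print(install_dir)
```

The printed directory is where a plain ‘python setup.py install’ would normally place a pure-Python module for that interpreter.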

Sometimes, though, you either won’t have write access to the Python installation directory or want to install a module just for your own use. In this case, you’ll want to use the bracketed part of the second command. Adding this argument will tell setup.py to install the module to your home directory (it’ll show up under ~/lib; the exact subdirectory, such as ~/lib/python or ~/lib/python2.x/site-packages, depends on the install option and your Python version). This is handy if you want to have a collection of your own modules separate from the base installation. Note that choosing this course of action requires a third step: you’ll need to tell Python where to find your module by adding that directory to its module search path. The quickest way to do this is to bring up a Python prompt and append the directory to sys.path:


>>> import sys

>>> sys.path.append('/home/daniel/lib/python2.6/site-packages/')

The only problem with updating the search path this way is that the change lasts only for the current Python session; every time you start a new interpreter, you’ll have to add the directory again. To overcome this problem, I set the PYTHONPATH environment variable in my ~/.bashrc file by adding the following line:


export PYTHONPATH=$PYTHONPATH:/home/daniel/lib/python2.6/site-packages

You’ll need to change the path to point to the directory holding your module in both of these methods.
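If you add directories from inside a script, it can help to avoid appending the same entry twice. Here is a small sketch of that idea; the helper name add_to_search_path and the example directory are hypothetical, so substitute the directory that actually holds your module.

```python
import os
import sys

def add_to_search_path(directory):
    """Append directory to sys.path unless it is already listed.

    Returns True if the directory was added, False if it was already there.
    """
    directory = os.path.abspath(os.path.expanduser(directory))
    if directory in sys.path:
        return False
    sys.path.append(directory)
    return True

# Hypothetical location; replace with the directory holding your module.
example_dir = "~/lib/python/site-packages"
print(add_to_search_path(example_dir))

# Show what the shell passed in, if anything.
print(os.environ.get("PYTHONPATH", "(PYTHONPATH is not set)"))
```

As with the prompt version, this only affects the running interpreter; the ~/.bashrc line is what makes the change stick between sessions.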

Finally, we need to test that the module was correctly installed. Some modules provide automated tests you can run, but in general you should just go to a Python prompt and enter:


>>> import module_name

If you get an error, verify that your PYTHONPATH was updated correctly and that your module is actually in the installation directory. If nothing happens (you just get another prompt), then you’re all set. If you’re following along by working with python-twitter, then replace ‘module_name’ with twitter and see if the installation worked.
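The same check can be scripted, which is handy when you want to confirm which copy of a module Python picked up. This is a rough sketch using the standard importlib module; the function name check_module is made up for this example, and json simply stands in for whatever module you installed.

```python
import importlib

def check_module(module_name):
    """Return the file a module was loaded from, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Built-in modules and namespace packages have no file on disk,
    # so fall back to a descriptive label for them.
    return getattr(module, "__file__", None) or "(no file on disk)"

print(check_module("json"))                # a path inside the standard library
print(check_module("no_such_module_xyz"))  # None, because the import fails
```

If the path printed for your module isn’t in the directory you installed to, another copy earlier on the search path is probably shadowing it.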

Importing/Using Modules

At the end of the previous section, we imported the module to test if it installed correctly. Python doesn’t just go and add all of your modules for instant access whenever you want – doing so would waste resources. Instead, you need to explicitly tell Python when to fetch your custom utilities.
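You can watch this on-demand loading happen by peeking at sys.modules, the dictionary where Python records every module it has loaded so far. The sketch below uses the standard colorsys module only because it is unlikely to be loaded already; in a fresh interpreter the first value is normally False, but it may be True if something else imported the module earlier.

```python
import sys

def is_loaded(name):
    """Report whether a module is already recorded in sys.modules."""
    return name in sys.modules

before = is_loaded("colorsys")
import colorsys  # the module is fetched only at this point
after = is_loaded("colorsys")

print(before, after)
```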

There are several ways to import your modules. Here’s the rundown, using the python-twitter module as an example:


>>> import twitter                      # A

>>> import twitter as twit_client   # B

>>> from twitter import Api         # C

>>> from twitter import *            # D

All four of these methods are valid but differ a bit in how you’ll use them, so let’s proceed with a specific task in mind. Let’s say that we want to post a tweet from our personal account. To do this, we need to instantiate the Api class provided by python-twitter with our username and password, and then call its PostUpdate() method. This boils down to two easy lines of code, which we’ll see shortly. The big question we need to answer for this task is – how will Python know where the Api class’ code is? The answer depends on your import statement.

Option A – ‘import twitter’

If we just import twitter, then we’ve basically imported a package of code into Python. We can access it just as we would access any other Python file on our current path – by calling it directly by its name. So, to post our tweet, we need to do the following:


>>> import twitter

>>> me = twitter.Api(username='me', password='mypassword')

>>> status = me.PostUpdate('This is my update')

>>> print status.text

This is my update

All of the code has been compartmentalized in the twitter module, so we just call the class’ constructor directly from there. This should be really straightforward.

Option B – ‘import twitter as twit_client’

Sometimes you’ll want to give your module a special name – especially if it’s a really long name that you don’t want to type out every single time you need to access it. With this import statement, we only need to change one thing for our code to work:


>>> import twitter as twit_client

>>> me = twit_client.Api(username='me', password='mypassword')

>>> status = me.PostUpdate('This is my update')

>>> print status.text

This is my update

As you can see, we just have to reference the correct module name so Python knows what to look for.

Option C – ‘from twitter import Api’

If you have a very large module, you might want to import a small subset of it. For instance, I do a great deal of reading netCDF files via Python and the SciPy package. SciPy is HUGE – it has tons of statistics and linear algebra functions that I just don’t need to use very often. So, I often import just the io sub-module of SciPy to save memory and speed up my scripts. Again, the change to our code is small:


>>> from twitter import Api

>>> me = Api(username='me', password='mypassword')

>>> status = me.PostUpdate('This is my update')

>>> print status.text

This is my update

Notice that we don’t need to reference the twitter package anymore since we directly imported the Api class into our local environment.
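Options A, B, and C differ only in which names end up in your local namespace; the underlying objects are the same. Here is a short sketch of that point, using the standard math module as a stand-in so it runs without python-twitter installed.

```python
import math             # style A: bind the module under its own name
import math as m        # style B: bind the same module under an alias
from math import sqrt   # style C: bind a single name from the module

# All three routes lead to the very same objects.
same_module = m is math
same_function = sqrt is math.sqrt
print(same_module, same_function)

# So all three calls compute the same thing.
print(math.sqrt(16.0), m.sqrt(16.0), sqrt(16.0))
```

Which style you pick is therefore mostly a question of readability: how much of the module’s name you want to see at each call.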

Option D – ‘from twitter import *’

This command is considered un-pythonic because it’s not explicit as to what it does, although its function is simple. Calling this import statement brings all the components of twitter into your local namespace so that you can call them as we did in Option C:


>>> from twitter import *

>>> me = Api(username='me', password='mypassword')

>>> status = me.PostUpdate('This is my update')

>>> print status.text

This is my update

The difference between this and C is that you get everything from the module, not just the Api class. I use this as a shortcut when I use Pylab, but in general, you should be explicit with your imports and just import exactly what you need.
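To get a feel for how much a star import pulls in, you can run each style of import inside its own empty namespace and compare what lands there. This is only an illustrative sketch: it uses the standard math module in place of twitter, and the helper public_names is made up for the example.

```python
def public_names(namespace):
    """List the names in a namespace, skipping double-underscore entries."""
    return sorted(name for name in namespace if not name.startswith("__"))

# Style C: import exactly one name into a fresh namespace.
explicit_ns = {}
exec("from math import sqrt", explicit_ns)

# Style D: import everything the module exposes.
star_ns = {}
exec("from math import *", star_ns)

print(public_names(explicit_ns))
print(len(public_names(star_ns)))
```

The second count is far larger than one, and every one of those names could silently overwrite a variable of the same name in your own code.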

Final Thoughts

This covers the basics behind using Python modules. As I said earlier, they’re REALLY easy to use, and you can find one for just about any job you’ll need to tackle. In the next article in this series, we’ll look at two numerical modules for Python – NumPy and SciPy – and see some example problems that they’re well suited to solving.

~ by counters on July 22, 2009.
