Friday, November 30, 2012

Backbone.js Tutorial: Build Environment

This new Backbone.js tutorial series will walk you through building a single page web application that has a customised Backbone.sync implementation. I started building the application that these tutorials are based on back in August, and it’s been running smoothly for a few months now so I thought it was safe enough to unleash it.

Gmail to-do lists: not cool enough!
The application itself was built to solve a need of mine: a more usable Google Mail to-do list. The Gmail-based interface rubs me the wrong way to put it mildly, so I wrote a Backbone.sync method that works with Google’s APIs and stuck a little Bootstrap interface on top. As part of these tutorials I’ll also make a few suggestions on how to customise Bootstrap – there’s no excuse for releasing vanilla Bootstrap sites!

The app we’ll be making won’t feature everything that Google’s to-do lists support: I haven’t yet added support for indenting items, for example. However, it serves my needs very well, so hopefully it’ll be something you’ll actually want to use.

Roadmap


Over the next few weeks I’ll cover the following topics:

  • Creating a new Node project for building the single page app
  • Using RequireJS with Backbone.js
  • Google’s APIs
  • Writing and running tests
  • Creating the Backbone.js app itself
  • Techniques for customising Bootstrap
  • Deploying to Dropbox, Amazon S3, and potentially other services

Creating a Build Environment


If your focus is on client-side scripting, then I think this will be useful to you. Our goal is to create a development environment that can do the following:

  • Allow the client-side code to be written as separate files
  • Combine separate files into something suitable for deployment
  • Run the app locally using separate files (to make development and debugging easier)
  • Manage supporting Node modules
  • Run tests
  • Support Unix and Windows

To do this we’ll need a few tools and libraries: Node and npm, plus the RequireJS, Connect, Mocha, Chai, Grunt, and grunt-exec modules that we’ll install through npm in step 1.

Make sure your system has Node installed. The easiest way to install it is by using one of the Node packages for your system.

Step 1: Installing the Node Modules


Create a new directory for this project, and create a new file inside it called package.json that contains this JSON:
{
  "name": "btask"
, "version": "0.0.1"
, "private": true
, "dependencies": {
    "requirejs": "latest"
  , "connect": "2.7.0"
  }
, "devDependencies": {
    "mocha": "latest"
  , "chai": "latest"
  , "grunt": "latest"
  , "grunt-exec": "latest"
  }
, "scripts": {
    "grunt": "node_modules/.bin/grunt"
  }
}
Run npm install. These modules, along with their dependencies, will be installed in ./node_modules.

The private property prevents you from accidentally releasing this module publicly. This is useful for closed source commercial projects, or projects that aren’t suitable for release through npm.

Even if you’re not a server-side developer, managing dependencies with npm is useful because it makes it easier for other developers to work on your project. When a new developer joins your project, they can just type npm install instead of figuring out what downloads to grab.

Step 2: Local Web Server


Create a directory called app and a file called app/index.html:
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>bTask</title>
  <script type="text/javascript" src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
  <script type="text/javascript" src="js/lib/require.js"></script>
</head>
<body>
</body>
</html>
Once you’ve done that, create a file called server.js in the top-level directory:
var connect = require('connect')
  , http = require('http')
  , app
  ;

app = connect()
  .use(connect.static('app'))
  .use('/js/lib/', connect.static('node_modules/requirejs/'))
  .use('/node_modules', connect.static('node_modules'))
  ;

http.createServer(app).listen(8080, function() {
  console.log('Running on http://localhost:8080');
});
This file uses the Connect middleware framework to act as a small web server for serving the files in app/. You can add new paths to it by copying the .use(connect.static('app')) line and changing app to something else.

Notice how I’ve mapped the web path for /js/lib/ to node_modules/requirejs/ on the file system – rather than copying RequireJS to where the client-side scripts are stored we can map it using Connect. Later on the build scripts will copy node_modules/requirejs/require.js to build/js/lib so the index.html file won’t have to change. This will enable the project to run on a suitable web server, or a hosting service like Amazon S3 for static sites.

To run this Node server, type npm start (or node server.js) and visit http://localhost:8080. It should display an empty page with no client-side errors.

Step 3: Configuring RequireJS


This project will consist of modules written in the AMD format. Each Backbone collection, model, view, and so on will exist in its own file, with a list of dependencies so RequireJS can load them as needed.

RequireJS projects that work this way are usually structured around a “main” file that loads the necessary dependencies to boot up the app. Create a file called app/js/main.js that contains the following skeleton RequireJS config:
requirejs.config({
  baseUrl: 'js',

  paths: {
  },

  shim: {
  }
});

require(['app'],

function(App) {
  window.bTask = new App();
});
The part that reads require(['app'], ...) will load app/js/app.js. Create this file with the following contents:
define([], function() {
  var App = function() {
  };

  App.prototype = {
  };

  return App;
});
This is a module written in the AMD format – the define function is provided by RequireJS and in future will contain all of the internal dependencies for the project.

To finish off this step, main.js should be loaded. Add a suitable script tag near the bottom of app/index.html, just before the </body> tag:
<script type="text/javascript" src="js/main.js"></script>
If you refresh http://localhost:8080 in your browser, open the JavaScript console, and type window.bTask, you should see that bTask has been instantiated.

Step 4: Testing


Everything you’ve learned in the previous three steps can be reused to create a unit testing suite. Mocha has already been installed by npm, so let’s create a suitable test harness.

Create a new directory called test/ that contains a file called index.html:
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>bTask Tests</title>
  <link rel="stylesheet" href="/node_modules/mocha/mocha.css" />
  <style>
.toast-message, #main { display: none }
  </style>
</head>
<body>
  <div id="mocha"></div>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
  <script src="/node_modules/chai/chai.js"></script>
  <script src="/node_modules/mocha/mocha.js"></script>
  <script src="/js/lib/require.js"></script>
  <script src="/js/main.js"></script>
  <script src="setup.js"></script>
  <script src="app.test.js"></script>
  <script>require(['app'], function() { mocha.run(); });</script>
</body>
</html>
The require near the end just makes sure mocha.run only runs when /js/app.js has been loaded.

Create another file called test/setup.js:
var assert = chai.assert;

mocha.setup({
  ui: 'tdd'
, globals: ['bTask']
});
This file makes Chai’s assertions available as assert, which is how I usually write my tests. I’ve also told Mocha that bTask is an expected global variable.

With all this in place we can write a quick test. This file goes in test/app.test.js:
suite('App', function() {
  test('Should be present', function() {
    assert.ok(window.bTask);
  });
});
All it does is check that window.bTask has been defined – it proves RequireJS has loaded the app.

Restart the web server (from step 2), and visit http://localhost:8080/test. Mocha should display that a single test has passed.

Step 5: Making Builds


Create a file called grunt.js for our “gruntfile”:
module.exports = function(grunt) {
  grunt.loadNpmTasks('grunt-exec');

  grunt.initConfig({
    exec: {
      build: {
        command: 'node node_modules/requirejs/bin/r.js -o require-config.js'
      }
    }
  });

  grunt.registerTask('copy-require', function() {
    grunt.file.mkdir('build/js/lib');
    grunt.file.copy('node_modules/requirejs/require.js', 'build/js/lib/require.js');
  });

  grunt.registerTask('default', 'exec copy-require');
};
This file uses the grunt-exec plugin by Jake Harding to run the RequireJS command that generates a build of everything in the app/ directory. To tell RequireJS what to build, create a file called require-config.js:
({
  appDir: 'app/'
, baseUrl: 'js'
, paths: {}
, dir: 'build/'
, modules: [{ name: 'main' }]
})
RequireJS will minify and concatenate the necessary files. The other Grunt task copies the RequireJS client-side code to build/js/lib/require.js, because our local Connect server was mapping this path for us. Why bother? Well, it means whenever we update RequireJS through npm, the app and the builds will automatically get the latest version.

To run Grunt, type npm run-script grunt. The npm command run-script is used to invoke scripts that have been added to the package.json file. The package.json created in step 1 contained "grunt": "node_modules/.bin/grunt", which makes this work. I prefer this to installing Grunt globally.

I wouldn’t usually use Grunt for my own projects because I prefer Makefiles. In fact, a Makefile for the above would be very simple. However, this makes things more awkward for Windows-based developers, so I’ve included Grunt in an effort to support Windows. Also, if you typically work as a client-side developer, you might find Grunt easier to understand than learning GNU Make or writing the equivalent Node code (Node has a good file system module).

Summary


In this tutorial you’ve created a Grunt and RequireJS build environment for Backbone.js projects that use Mocha for testing. You’ve also seen how to use Connect to provide a convenient local web server.

This is basically how I build and manage all of my Backbone.js single page web applications. Although we haven’t written much code yet, as you’ll see over the coming weeks this approach works well for using Backbone.js and RequireJS together.

The code for this project can be found here: dailyjs-backbone-tutorial (2a8517).

Is Python call-by-value or call-by-reference? Neither.


One aspect of Python programming that trips up those coming from languages like C or Java is how arguments are passed to functions in Python. At a more fundamental level, the confusion arises from a misunderstanding about Python's object-centric data model and its treatment of assignment. When asked whether Python's function calling model is "call-by-value" or "call-by-reference", the correct answer is: neither. Indeed, trying to shoe-horn those terms into a conversation about Python's model is misguided. "Call-by-object", or "call-by-object-reference", is a more accurate way of describing it. But what does "call-by-object" even mean?

In Python, (almost) everything is an object. What we commonly refer to as "variables" in Python are more properly called names. Likewise, "assignment" is really the binding of a name to an object. Each binding has a scope that defines its visibility, usually the block in which the name originates.
That's a lot of terminology all at once, but those basic terms form the cornerstone of Python's execution model. Compared to, say, C++, the differences are subtle yet important. A concrete example will highlight these differences. Think about what happens when the following C++ code is executed:
string some_guy = "Fred";
// ...
some_guy = "George";

In the above, the variable some_guy refers to a location in memory, and the value 'Fred' is inserted in that location (indeed, we can take the address of some_guy to determine the portion of memory to which it refers). Later, the contents of the memory location referred to by some_guy are changed to 'George'. The previous value no longer exists; it was overwritten. This likely matches your intuitive understanding (even if you don't program in C++).
Let's now look at a similar block of Python code:
some_guy = 'Fred'
# ...
some_guy = 'George'

Binding Names to Objects

On line 1, we create a binding between a name, some_guy, and a string object containing 'Fred'. In the context of program execution, the environment is altered; a binding of the name some_guy to a string object is created in the scope of the block where the statement occurred. When we later say some_guy = 'George', the string object containing 'Fred' is unaffected. We've just changed the binding of the name some_guy. We haven't, however, changed either the 'Fred' or 'George' string objects. As far as we're concerned, they may live on indefinitely.
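We can watch the rebinding happen with the built-in id function, which returns an object's identity. A quick sketch (the actual id values will differ on your machine):
some_guy = 'Fred'
print(id(some_guy))   # the identity of the string object containing 'Fred'
some_guy = 'George'
print(id(some_guy))   # a different id: the name was rebound to a new
                      # object; the 'Fred' object itself was never touched
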
With only a single name binding, this may seem overly pedantic, but it becomes more important when bindings are shared and function calls are involved. Let's say we have the following bit of Python code:
some_guy = 'Fred'

first_names = []
first_names.append(some_guy)

another_list_of_names = first_names
another_list_of_names.append('George')
some_guy = 'Bill'

print (some_guy, first_names, another_list_of_names)

So what gets printed in the final line? Well, to start, the binding of some_guy to the string object containing 'Fred' is added to the block's namespace. The name first_names is bound to an empty list object. On line 4, a method is called on the list object first_names is bound to, appending the object some_guy is bound to. At this point, there are still only two objects that exist: the string object and the list object. some_guy and first_names[0] both refer to the same object (indeed, print(some_guy is first_names[0]) shows this).
Let's continue to break things down. On line 6, a new name is bound: another_list_of_names. Assignment between names does not create a new object. Rather, both names are simply bound to the same object. As a result, the string object and list object are still the only objects that have been created by the interpreter. On line 7, a member function is called on the object another_list_of_names is bound to and it is mutated to contain a reference to a new object: 'George'. So to answer our original question, the output of the code is
Bill ['Fred', 'George'] ['Fred', 'George']

This brings us to an important point: there are actually two kinds of objects in Python. A mutable object exhibits time-varying behavior. Changes to a mutable object are visible through all names bound to it. Python's lists are an example of mutable objects. An immutable object does not exhibit time-varying behavior. The value of immutable objects can not be modified after they are created. They can be used to compute the values of new objects, which is how a function like string.join works. When you think about it, this dichotomy is necessary because, again, everything is an object in Python. If integers were not immutable I could change the meaning of the number '2' throughout my program.
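To make the dichotomy concrete, here's a quick sketch contrasting the two kinds of objects (the names are illustrative):
name = 'Fred'
name.upper()             # returns a brand-new string object, 'FRED'
print(name)              # 'Fred' -- the original object is unchanged

names = ['Fred']
names.append('George')   # mutates the list object in place
print(names)             # ['Fred', 'George'] -- same object, new contents
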
It would be incorrect to say that "mutable objects can change and immutable ones can't", however. Consider the following:
first_names = ['Fred', 'George', 'Bill']
last_names = ['Smith', 'Jones', 'Williams']
name_tuple = (first_names, last_names)

first_names.append('Igor')

Tuples in Python are immutable. We can't change the tuple object name_tuple is bound to. But immutable containers may contain references to mutable objects like lists. Therefore, even though name_tuple is immutable, it "changes" when 'Igor' is appended to first_names on the last line. It's a subtlety that can sometimes (though very infrequently) prove useful.
By now, you should almost be able to intuit how function calls work in Python. If I call foo(bar), I'm merely creating a binding within the scope of foo to the object the argument bar is bound to when the function is called. If bar refers to a mutable object and foo changes its value, then these changes will be visible outside of the scope of the function.
def foo(bar):
    bar.append(42)
    print(bar)
    # >> [42]

answer_list = []
foo(answer_list)
print(answer_list)
# >> [42]

On the other hand, if bar refers to an immutable object, the most that foo can do is create a name bar in its local namespace and bind it to some other object.
def foo(bar):
    bar = 'new value'
    print (bar)
    # >> 'new value'

answer_list = 'old value'
foo(answer_list)
print(answer_list)
# >> 'old value'

Hopefully by now it is clear why Python is neither "call-by-reference" nor "call-by-value". In Python a variable is not an alias for a location in memory. Rather, it is simply a binding to a Python object. While the notion of "everything is an object" is undoubtedly a cause of confusion for those new to the language, it allows for powerful and flexible language constructs, which I'll discuss in my next post.

Writing Idiomatic Python


As someone who evangelizes Python at work, I read a lot of code written by
professional programmers new to Python. I've written a good amount of Python
code in my time, but I've certainly read far more. The single quickest way to
increase maintainability and decrease 'simple' bugs is to strive to write
idiomatic Python. Whereas some dynamic languages embrace the idea of there being
no 'right' way to solve a problem, the Python community generally appreciates
the liberal use of 'Pythonic' solutions to problems. 'Pythonic' refers to the
principles laid out in 'The Zen of Python' (try typing 'import this' in an
interpreter...). One of those principles is
'There should be one-- and preferably only one --obvious way to do it'
                                                -from 'The Zen of Python' by Tim Peters

In that vein, I've begun compiling a list of Python idioms that programmers
coming from other languages may find helpful. I know there are a ton of things
not on here; it's merely a skeleton list that I'll add to over time. If you have
a specific idiom you think should be added, let me know in the comments and I'll
add it with attribution to the name you use in your comment.


This list will temporarily live here as a blog post, but I have an interesting
idea for its final home. More on that next week.

Update: 'Writing Idiomatic Python' e-Book Coming

See here for details!
Update 10/05/12: Added context managers, PEP8, itertools, string join(), and
dict.get() default values

Idioms

Formatting

Python has a language-defined standard set of formatting rules known as PEP8. If you're browsing commit messages on Python projects, you'll likely find them littered with references to PEP8 cleanup. The reason is simple: if we all agree on a common set of naming and formatting conventions, Python code as a whole becomes instantly more accessible to both novice and experienced developers. PEP8 is perhaps the most explicit example of idioms within the Python community. Read the PEP, install a PEP8 style-checking plugin for your editor (they all have one), and start writing your code in a way that other Python developers will appreciate. Listed below are a few examples.
Identifier Type                        Format                            Example
Class                                  Camel case                        class StringManipulator:
Variable                               Words joined by underscore        words_joined_by_underscore = True
Function                               Words joined by underscore        def are_words_joined_by_underscore(words):
'Internal' class members/functions     Prefixed by a single underscore   def _update_statistics(self):
Unless wildly unreasonable, abbreviations should not be used (acronyms are fine if in common use, like 'HTTP')

Working With Data

Avoid using a temporary variable when swapping two variables

There is no reason to swap using a temporary variable in Python. We can use
tuples to make our intention more clear.
Harmful
temp = foo
foo = bar
bar = temp

Idiomatic
(foo, bar) = (bar, foo)

Use tuples to unpack data

In Python, it is possible to 'unpack' data for multiple assignment. Those familiar with LISP may know this as a 'destructuring bind'.
Harmful
list_from_comma_separated_value_file = ['dog', 'Fido', 10]
animal = list_from_comma_separated_value_file[0]
name = list_from_comma_separated_value_file[1]
age = list_from_comma_separated_value_file[2]

Idiomatic
list_from_comma_separated_value_file = ['dog', 'Fido', 10]
(animal, name, age) = list_from_comma_separated_value_file

Use ''.join when creating a single string for list elements

It's faster, uses less memory, and you'll see it everywhere anyway. Note that
the two quotes represent the delimiter between list elements in the string
we're creating; '' means we want to concatenate the elements with no
characters between them.
Harmful
result_list = ['True', 'False', 'File not found']
result_string = ''
for result in result_list:
    result_string += result

Idiomatic
result_list = ['True', 'False', 'File not found']
result_string = ''.join(result_list)

Use the 'default' parameter of dict.get() to provide default values

Often overlooked in the get() definition is the default parameter. Without
using default (or the collections.defaultdict class), your code will be
littered with confusing if statements. Remember, strive for clarity.
Harmful
log_severity = None
if 'severity' in configuration:
    log_severity = configuration['severity']
else:
    log_severity = log.Info

Idiomatic
log_severity = configuration.get('severity', log.Info)
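The collections.defaultdict class mentioned above is another option when many lookups share one fallback. A minimal sketch (a plain 'info' string stands in for log.Info to keep the example self-contained):
from collections import defaultdict

configuration = defaultdict(lambda: 'info')   # 'info' stands in for log.Info
configuration['verbosity'] = 'debug'
print(configuration['severity'])    # 'info' -- this key was never set
print(configuration['verbosity'])   # 'debug'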

Use Context Managers to ensure resources are properly managed

Similar to the RAII principle in languages like C++ and D, context managers
(objects meant to be used with the with statement) can make resource
management both safer and more explicit. The canonical example is file IO.
Harmful
file_handle = open(path_to_file, 'r')
for line in file_handle.readlines():
    if some_function_that_throws_exceptions(line):
        pass  # do something
file_handle.close()

Idiomatic
with open(path_to_file, 'r') as file_handle:
    for line in file_handle:
        if some_function_that_throws_exceptions(line):
            pass  # do something
# No need to explicitly call 'close'. Handled by the file's context manager

In the Harmful code above, what happens if some_function_that_throws_exceptions does, in fact, throw an exception? Since we haven't caught it in the code listed, it will propagate up the stack. We've hit an exit point in our code that might have been overlooked, and we now have no way to close the opened file. In addition to the context managers in the standard libraries (for working with things like file IO, synchronization, and managing mutable state), developers are free to create their own.

Learn the contents of the itertools module

If you frequent sites like StackOverflow, you may notice that the answer to questions of the form "Why doesn't Python have the following obviously useful library function?" almost always references the itertools module. The functional programming stalwarts that itertools provides should be seen as fundamental building blocks. What's more, the documentation for itertools has a 'Recipes' section that provides idiomatic implementations of common functional programming constructs, all created using the itertools module. For some reason, a vanishingly small number of Python developers seem to be aware of the 'Recipes' section and, indeed, the itertools module in general (hidden gems in the Python documentation is actually a recurring theme). Part of writing idiomatic code is knowing when you're reinventing the wheel.
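As a small taste, here's a sketch using two of those building blocks, chain and islice, to flatten two lists and then take every other element:
from itertools import chain, islice

letters = ['a', 'b', 'c']
numbers = [1, 2, 3]
flattened = list(chain(letters, numbers))           # ['a', 'b', 'c', 1, 2, 3]
every_other = list(islice(flattened, 0, None, 2))   # ['a', 'c', 2]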

Control Structures

If Statement

Avoid placing the conditional branch on the same line as the colon

Using indentation to indicate scope (like you already do everywhere
else in Python) makes it easy to determine what will be executed as part of a
conditional statement.
Harmful
if name: print(name)
print(address)

Idiomatic
if name:
    print(name)
print(address)

Avoid having multiple statements on a single line

Though the language definition allows one to use a semi-colon to delineate
statements, doing so without reason makes one's code harder to read. This rule
is typically violated together with the previous one.
Harmful
if this_is_bad_code: rewrite_code(); make_it_more_readable();

Idiomatic
if this_is_bad_code: 
    rewrite_code()
    make_it_more_readable()

Avoid repeating the variable name in a compound if statement

When one wants to check against a number of values, repeatedly listing the
variable whose value is being checked is unnecessarily verbose. Using a temporary
collection makes the intention clear.
Harmful
if name == 'Tom' or name == 'Dick' or name == 'Harry':
    is_generic_name = True

Idiomatic
if name in ('Tom', 'Dick', 'Harry'):
    is_generic_name = True

Use list comprehensions to create lists that are subsets of existing data

List comprehensions, when used judiciously, increase clarity in code that
builds a list from existing data. Especially when data is both checked for some
condition and transformed in some way, list comprehensions make it clear
what's happening. There are also (usually) performance benefits to using list
comprehensions (or, alternately, set comprehensions) due to optimizations in
the CPython interpreter.
Harmful
some_other_list = range(1, 100)
my_weird_list_of_numbers = list()
for element in some_other_list:
    if is_prime(element):
        my_weird_list_of_numbers.append(element+5)

Idiomatic
some_other_list = range(1, 100)
my_weird_list_of_numbers = [element + 5 for element in some_other_list if is_prime(element)]

Loops

Use the in keyword to iterate over an Iterable

Programmers coming from languages lacking a for_each style construct are used to
iterating over a container by accessing elements via index. Python's in
keyword handles this gracefully.
Harmful
my_list = ['Larry', 'Moe', 'Curly']
index = 0
while index < len(my_list):
    print(my_list[index])
    index += 1

Idiomatic
my_list = ['Larry', 'Moe', 'Curly']
for element in my_list:
    print(element)

Use the enumerate function in loops instead of creating an 'index' variable

Programmers coming from other languages are used to explicitly declaring a
variable to track the index of a container in a loop. For example, in C++:
for (int i=0; i < container.size(); ++i)
{
    // Do stuff
}

In Python, the enumerate built-in function handles this role.
Harmful
index = 0
for element in my_container:
    print(index, element)
    index += 1

Idiomatic
for index, element in enumerate(my_container):
    print(index, element)

Django Production Deployment and Development Using Git


When I started IllestRhyme, I had never before managed a web application. Much was similar to enterprise development. Much wasn't. One of the things I had no idea about was how to manage production deployment of a web app. I settled on some common Django trickery and Git, and it has worked like a charm.
I knew going in that I would use Git for source control. I wanted a distributed version control system to give me the opportunity to work anywhere git was installed. I didn't suspect I would also use git for deployment.
When the site began, I didn't even have a "deployment" strategy. There were so few visitors to the site that I could work on it live. Within two weeks, it was clear I couldn't be showing users HTTP 500 errors as frequently as I had been. I needed to start acting like I was working on a "real" project.
(Re)Enter Git. I would have a few false starts before I settled on a safe, productive way to work. Initially, I created a new directory on my machine for development. I cloned my git repository and created a dev branch. The dev branch had the same settings.py file as the master branch, and I was editing this manually as I switched between the dev and master branches. I knew this was a dangerous practice, and this proved true when I hosed the production database because of a bad settings file. Good thing I had DB backups...
There had to be a better way. I decided that, since the Django settings.py file was just Python, I would create a localsettings.py file that the settings.py file would import. For development, this would point to the development database and settings. For production, the production settings. This file is imported by the settings.py file and is not tracked by git (there's an entry for it in the .gitignore file).
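A minimal sketch of the idea, assuming the bottom of settings.py pulls in the untracked file (the try/except is one common way to fail gracefully when localsettings.py is absent):

# at the bottom of settings.py
try:
    from localsettings import *
except ImportError:
    pass  # no localsettings.py on this machine; the defaults above apply
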
Now I was free to work on my dev branch without worrying about messing up production. When I was happy with a change, it was merged with the master branch and pushed to bitbucket. Then the production area pulled down the changes and Apache was restarted. Perfect.
Something that took a bit of getting used to with git was branching. In enterprise development with CVS or SVN, branches are more substantial "things" than they are in personal development with git. A branch in git can be created and deleted quickly. I frequently have five or six active branches of development for IllestRhyme: some for large sweeping changes that require database migrations, some just adding a few new pages/views to the site, some as small as correcting typos or adding a link or two.
Where git really shines is in switching between active branches. I can be working on a branch with 70 new files, say git checkout <somesmallbranch>, and everything is exactly as it should be, with the added files removed and the changes merged back to my small change branch. This allows me to work almost in real-time. If I'm deep in a change but get alerted to an error via email, I can quickly switch to my main dev branch, make a change, test it, commit it and pull it down in mere minutes.
Git has opened up a new world for me in terms of productivity. It's been so useful on IllestRhyme that I've begun to use it at my day job as an "out-of-band" VCS. I check out with our enterprise VCS, do a quick git add . to my personal git repository, and branch/commit/merge to my heart's content. When I'm happy with my changes, I commit to my enterprise VCS, which has been instructed to ignore my .git directory.

Starting a Django 1.4 Project the Right Way


Back in February, I wrote a post titled 'Starting a Django Project the Right
Way', which still draws a consistent audience eight months later. In those eight months,
Django has released version 1.4 of the framework, with 1.5 under active
development and promising experimental support for Python 3.x. Given these
changes, as well as the availability of new and updated resources available to
Django developers, I decided to revisit the concept of best practices when
starting a Django project.

The beginning of a project is a critical time. Choices are made that have long
term consequences. There are a number of tutorials about how to get started with
the Django framework, but few that discuss how to use Django in a professional
way, using industry accepted best practices to make sure your project development
scales as your application grows. A small bit of planning goes a long way
towards making your life easier in the future.
By the end of this post, you will have
  1. A fully functional Django 1.4 project
  2. All resources under source control (with git or Mercurial)
  3. Automated regression and unit testing (using the unittest library)
  4. An environment independent install of your project (using virtualenv)
  5. Automated deployment and testing (using Fabric)
  6. Automatic database migrations (using South)
  7. A development workflow that scales with your site.
None of these steps, except for perhaps the first, are covered in the
official tutorial. They should be. If you're looking to start a new,
production ready Django 1.4 project, look no further.


Prerequisites

A working knowledge of Python is assumed. Also, some prior experience
with Django would be incredibly helpful, but not strictly necessary.
You'll need git or Mercurial for version control. That's
it!

Preparing To Install

I'm assuming you have Python installed. If you don't, head over to
python.org and find the install instructions
for your architecture/OS. I'll be running on a 64-bit Ubuntu server installation hosted by Linode, with whom I'm very happy.
So, what's the first step? Install Django, right? Not quite. One common
problem with installing packages directly to your current site-packages
area is that, if you have more than one project or use Python on your
machine for things other than Django, you may run into dependency
issues between your applications and the installed packages. For this
reason, we'll be using
virtualenv and the excellent
extension virtualenvwrapper to manage our
Django installation. This is common, and recommended, practice among
Python and Django users.
If you're using pip to install packages (and I can't see why you wouldn't), you
can get both virtualenv and virtualenvwrapper by simply installing the latter.
$ pip install virtualenvwrapper

After it's installed, add the following lines to your shell's startup file
(.zshrc, .bashrc, .profile, etc).
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/directory-you-do-development-in
source /usr/local/bin/virtualenvwrapper.sh

Reload your startup file (e.g. source .zshrc) and you're ready to go.

Creating a new environment

Creating a virtual environment is simple. Just type
$ mkvirtualenv django_project

where "django_project" is whatever name you give to your project.
You'll notice a few things happen right away:
  • Your shell prompt is now prefixed with "(django_project)"
  • distribute and pip were automatically installed
This is an extremely helpful part of virtualenvwrapper: it automatically
prepares your environment in a way that lets you start installing packages using
pip right away. The "(django_project)" portion is a reminder that you're using a
virtualenv instead of your system's Python installation. To exit the virtual
environment, simply type deactivate. When you want to resume work on your
project, it's as easy as workon django_project. Note that, unlike with the
vanilla virtualenv tool, it doesn't matter where you run these commands.

Installing Django

"Wait, 'Installing Django'? I already have Django installed!" Fantastic.
You aren't going to use it. Instead, we'll use one managed by virtualenv
that can't be messed up by other users (or yourself) working elsewhere
on the machine. To install Django under virtualenv, just type:
$ pip install django

This should give you the latest version of Django which will be installed in your
virtualenv area. You can confirm this by doing:
$ which django-admin.py

This should point into your $HOME/.virtualenvs/ directory. If it doesn't,
make sure you see "(django_project)" before your prompt. If you don't, activate
the virtualenv using workon django_project.

Setting Up The Project

Before we actually start the project, we need to have a little talk. I've spoken
to a number of Django developers over the past few months and the ones having
the most difficulty are those that don't use a version control system. Many new
developers have simply never been exposed to version control. Others think that
since "this is a small project," that it's not necessary. Wrong.
None of the tools listed here will pay greater dividends than the use of
a version control system.

Previously, I only mentioned git as a (D)VCS. However, this project being in
Python, Mercurial is a worthy Python-based alternative. Both are popular enough
that learning resources abound online. Make sure you have either git or
Mercurial installed. Both are almost certainly available via your distro's
packaging system.
If you plan on using git, GitHub is an obvious choice
for keeping a remote repository. With Mercurial, Atlassian's
Bitbucket is a fine choice (it supports git as well,
so you could use it in either case).

(source) Controlling Your Environment

Even though we haven't actually done anything yet, we know we're going to
want everything under source control. We have two types of 'things' we're going
to be committing: our code itself (including templates, etc) and supporting
files like database fixtures, South migrations (more on that later), and a
requirements file. In the old post, I recommended committing your actual virtualenv,
but there are a few good reasons not to, not the least of which is it's
unnecessary. Using a requirements file gives you all of the benefits with none
of the overhead.
Let's go ahead and create our project directory. Use the new startproject
command for django-admin.py to get it set up.
$ django-admin.py startproject django_project

We'll see a single directory created: django_project. Within the
django_project directory, we'll see another django_project directory
containing the usual suspects: settings.py, urls.py, and wsgi.py. At the same
level as the second django_project directory is manage.py.
Intermezzo: Projects vs. Apps
You may be wondering why the new startproject command was added alongside the
existing startapp command. The answer lies in the difference between
Django "projects" and Django "apps", which are clearly delineated in Django 1.4. Briefly,
a project is an entire web site or application. An "app" is a small,
(hopefully) self-contained Django application that can be used in any Django
project. If you're building a blogging application called "Super Blogger", then
"Super Blogger" is your Django project. If "Super Blogger" support reader polls,
the "polls" would be an Django app used by "Super Blogger". The idea is that
your polls app should be able to be reused in any Django project requiring user
polls, not just within "Super Blogger". A project is a collection of apps, along
with project specific logic. An app can be used in multiple projects.
While your natural inclination might be to include a lot of "Super Blogger"
specific code and information within your "polls" app, avoiding this has a
number of benefits. Based on the principle of loose coupling, writing your
apps as standalone entities prevents design decisions and bugs in your project
directly affecting your app. It also means that, if you wanted to, you could
pass off the development of any of your apps to another developer without them
needing to access or make changes to your main project.
Like many things in software development, it takes a bit of effort but pays huge
dividends later.

Setting up our repos

Since we have some "code" in our project now (really just some stock scripts and
empty config files, but bear with me), now is as good a time as any to
initialize our repositories in source control. Below are the steps to do this in
git and Mercurial.
git
$ git init

This creates a git repository in the current directory. Let's stage all of
our files to be committed.
$ git add django_project

Now we actually commit them to our new repo:
$ git commit -m 'Initial commit of django_project'

Mercurial
$ hg init

This creates a Mercurial repository in the current directory. Let's stage all of
our files to be committed.
$ hg add django_project

Now we actually commit them to our new repo:
$ hg commit -m 'Initial commit of django_project'

If you plan on using a service like GitHub or Bitbucket, now would be a
good time to push to them.

Using South for Database Migrations

One of the most frustrating aspects of Django is
managing changes to models and the associated changes to the database.
With the help of South, you can realistically create an entire
application without ever writing database specific code. Changes to your
models are detected and automatically made in the database through a
migration file that South creates. This lets you both migrate the
database forward for your new change and backwards to undo a change
or series of changes. It makes your life so much easier, it's a wonder
it's not included in the Django distribution (there has been some talk
of including a database migration tool in Django, but it hasn't happened
yet).
Still in our virtualenv, install South like so:
$ pip install south

We set up South by adding it to INSTALLED_APPS in the settings.py
file for the project (a sketch follows below). Add that now, as well as your database settings
for the project, then run python manage.py syncdb.
You'll be prompted for a superuser name and password (which you can go
ahead and enter). More importantly, South has set up the database with
the tables it needs.
You may have noticed that we just ran syncdb without having added an app to the project. We do this first so that South is installed from the beginning. All migrations to our own apps will be done using South, including the initial migration.
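Here's a sketch of the settings.py change (the app list is abbreviated; your exact contents will vary):
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'south',  # listed here so syncdb creates South's migration tables
)
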
Since we've just made some pretty big changes, now would be a good time
to commit. You should get used to committing frequently, as the
more granular the commit, the more freedom you have in choosing
something to revert to if things go wrong.
To commit, let's see what has changed.
(git)
$ git status
# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   django_project/settings.py
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       django_project/.settings.py.swp
#       django_project/__init__.pyc
#       django_project/settings.pyc

(Mercurial)
$ hg status
M django_project/django_project/settings.py
? django_project/django_project/.settings.py.swp
? django_project/django_project/__init__.pyc
? django_project/django_project/settings.pyc

With both git and Mercurial, you may notice files you don't ever want committed,
like the compiled Python .pyc files and vim swap .swp files above. To ignore
these files, create a .gitignore or .hgignore file in your root
project directory and add a shell pattern to match files you don't want
to be tracked. For example, the contents of my file might be
*.pyc
.*swp

Before we commit, we have one more piece of information to track: our installed
Python packages. We want to track the name and version of the Python packages
we're using so that we can seamlessly recreate our environment in our production
area. Helpfully, pip has a command that does exactly what we need.
$ pip freeze > requirements.txt

I piped the output to a file called requirements.txt, which we'll add to
source control so we always have an updated list of what packages are being used.
Let's stage and commit our settings.py and requirements.txt files by running
$ (git/hg) add django_project/settings.py requirements.txt
$ (git/hg) commit -m 'Added South for database migrations'

Creating Our App

Use manage.py to create an app in the normal way (python manage.py
startapp myapp
) and add it to INSTALLED_APPS. The first thing we'll do, before adding models, is
tell South we want to use it for migrations:
$ python manage.py schemamigration myapp --initial

This creates a migration file that can be used to apply our changes (if
we had any) and also revert them. We use the migration file to migrate the
database changes (even though there are none) using:
$ python manage.py migrate myapp

South is smart enough to know where to look for migration files, as well
as remember the last migration we did. You can specify
individual migration files, but it's usually not necessary.
When we eventually make changes to our model, we ask South to create a
migration using:
$ python manage.py schemamigration myapp --auto

This will inspect the models in myapp and create a migration that automatically
adds, deletes, or modifies the database tables accordingly. The changes can then
be applied to the database using the migrate command, as above.
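For example, here's a hypothetical change to myapp/models.py that --auto would detect (the Post model and its new field are purely illustrative):
from django.db import models

class Post(models.Model):
    title = models.CharField(max_length=100)
    # Newly added field: schemamigration --auto picks this up, and the
    # generated migration adds the column when you run migrate.
    created_at = models.DateTimeField(auto_now_add=True, null=True)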

Our Development Area

One of the things you should get used to is doing development in an
area separate from where you're serving your files from, for obvious
reasons. git and Mercurial make this simple and also help with deployments.
Create a directory somewhere other than where django_project is installed
for your development area (I just call it dev).
In your development directory, clone the current project using git or Mercurial:
$ (git/hg) clone /path/to/my/project/

Both tools will create an exact copy of the entire repository. All changes,
branches, and history will be available here. From here on out, you
should be working from your development directory.
Since branching with both git and Mercurial is so easy and cheap, create branches
as you work on new, orthogonal changes to your site. Here's how to do it with each tool:
(git)
$ git checkout -b <branchname>

This will both create a new branch named <branchname> and check it out.
Almost all of your development should be done on a branch, so that
master mimics the current production master and can be used for recovery at
any time.
(Mercurial)
$ hg branch <branchname>

Note that branching is kind of a contentious topic within the Mercurial
community, as there are a number of options available but no "obviously correct"
choice. Here, I use a named branch, which is probably the safest and most
informative style of branching. Any commits after the branch command are done on
the branch.

Using Fabric for Deployment

So we have the makings of a Django application. How do we deploy it?
Fabric. For a reasonably sized project, discussing anything else is a
waste of time. Fabric can be used for a number of purposes, but it really shines
in deployments.
$ pip install fabric

Fabric expects a fabfile named fabfile.py which defines all of the actions we
can take. Let's create that now. Put the following in fabfile.py in your project's root directory.
from fabric.api import local

def prepare_deployment(branch_name):
    local('python manage.py test django_project')
    local('git add -p && git commit')  # or local('hg add && hg commit')

This will run the tests and commit your changes, but only if your tests pass.
At this point, a simple "pull" in your production area
becomes your deployment. Let's add a bit more to actually deploy. Add
this to your fabfile.py:
from fabric.api import lcd, local

def deploy():
    with lcd('/path/to/my/prod/area/'):

        # With git...
        local('git pull /my/path/to/dev/area/')

        # With Mercurial...
        local('hg pull /my/path/to/dev/area/')
        local('hg update')

        # With both
        local('python manage.py migrate myapp')
        local('python manage.py test myapp')
        local('/my/command/to/restart/webserver')

This will pull your changes from the development master branch, run any
migrations you've made, run your tests, and restart your web server.
All in one simple command from the command line. If one of those steps
fails, the script stops and reports what happened. Once you fix the
issue, there is no need to run the steps manually. Since they're idempotent, you
can simply rerun the deploy command and all will be well.
Note that the code above assumes you're developing on the same machine you
deploy on. If that's not the case, the file would be mostly the same but would
use Fabric's run function instead of local. See the Fabric documentation for details.
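For reference, a minimal sketch of the remote flavor might look like this (the host and paths are placeholders, and the production machine would need access to your repository):
from fabric.api import cd, env, run

env.hosts = ['deploy@production.example.com']  # placeholder host

def deploy_remote():
    with cd('/path/to/my/prod/area/'):
        run('git pull')  # or the hg equivalent
        run('python manage.py migrate myapp')
        run('python manage.py test myapp')
        run('/my/command/to/restart/webserver')
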
So now that we have our fabfile.py created, how do we actually deploy?
Simple. Just run:
$ fab prepare_deployment
$ fab deploy

Technically, these could be combined into a single command, but I find
it's better to explicitly prepare your deployment and then deploy as it
makes you focus a bit more on what you're doing.

Enjoy Your New Django Application

That's it! You're ready to start your actual development. If you do a
lot of Django development, just dump all of the commands above into a
fabfile and make creating a proper Django app a one step process. I have
one which I'll upload to my GitHub account later. If you have any
questions or corrections, or think there's a tool/step I missed, feel
free to email me at jeff@jeffknupp.com or leave
a comment below. Follow me on Twitter to get all of the latest blog updates!

Thursday, November 29, 2012

MySQL-python on Mac OS

MySQL-python is the module Python uses to connect to MySQL databases. I had a little problem installing it here on Mac OS, but it's a known bug, and the solution is already out there in the oracle (Google). But I'll document it here too.

First, download the package and extract it:

$ tar xvzf MySQL-python-1.2.2.tar.gz
$ cd MySQL-python-1.2.2

If you don't have mysql_config on your PATH, you need to edit setup_posix.py: where it reads

mysql_config.path = "mysql_config"

change it to the full path:

mysql_config.path = "/usr/local/mysql/bin/mysql_config"

If it still returns the error:

In file included from /usr/local/mysql/include/mysql.h:47,
                 from _mysql.c:40:
/usr/include/sys/types.h:92: error: duplicate 'unsigned'
/usr/include/sys/types.h:92: error: two or more data types in declaration specifiers
error: command '/usr/bin/gcc-4.0' failed with exit status 1

just comment out the following line in the _mysql.c file:

#define uint unsigned int

Now just install it:

$ sudo python setup.py install

mod_wsgi on OS X

I finally managed to get mod_wsgi working on OS X, using Python 2.6.
I had problems compiling it by hand against the Apache that ships with the system, so I decided to install everything with MacPorts.

First I installed Python 2.5.2, then Python 2.6.1, leaving 2.6.1 as the default. My fear was that the python_select command below would overwrite the system's default python (/usr/bin/python) and cause internal problems, but /usr/bin/python is still there, with the 2.5.1 "tweaked" by Apple :-).

$ sudo port install python25
$ sudo port install python26
$ sudo python_select python26

Done. Now install Apache, also via MacPorts:

$ sudo port install apache2
$ sudo launchctl load -w /Library/LaunchDaemons/org.macports.apache2.plist

If you were using the Apache that ships with the system (like me), don't forget to disable it in System Preferences -> Sharing.

Now, to install mod_wsgi, I changed a few things in the Portfile, since the version that was there was old and required Python 2.4. My Portfile is at the path below:

/opt/local/var/macports/sources/rsync.macports.org/release/ports/www/mod_wsgi/Portfile

That's it. Here's a simple script to test that everything is working:

def application(environ, start_response):
    start_response('200 OK',  [('Content-type','text/plain')])
    return ['Ola mundo!\n']


Save it as /opt/local/apache2/htdocs/test.wsgi.
Edit your /opt/local/apache2/conf/httpd.conf and add the line:

WSGIScriptAlias /wsgi/test /opt/local/apache2/htdocs/test.wsgi

Then just restart Apache:

$ sudo /opt/local/apache2/bin/apachectl restart

And visit the URL http://localhost/wsgi/test.

I had problems trying to compile mod_wsgi (with MacPorts) against Python 2.6 while the system default (via python_select) was 2.5: it gave some errors with the start_response function. But now everything (apparently) works fine.

Setting up a Django production environment: compiling and configuring nginx

Here is another series of posts: now I’m going to write about setting up a Django production environment using nginx and Green Unicorn in a virtual environment. The subject of this first post is nginx, which is my favorite web server.

This post explains how to install nginx from source, compiling it on Linux. You might want to use apt, zif, yum or ports, but I prefer building from source. So, to build from source, make sure you have all the development dependencies (C headers), including the PCRE library headers, which the nginx rewrite module uses. If you want to build nginx with SSL support, keep in mind that you will need the libssl headers too.

Building nginx from source is a straightforward process: all you need to do is download it from the official site and build it with some simple options. In our setup, we’re going to install nginx under /opt/nginx, and run it as the nginx system user. So, let’s download and extract the latest stable version (1.0.9) from the nginx website:
% curl -O http://nginx.org/download/nginx-1.0.9.tar.gz
% tar -xzf nginx-1.0.9.tar.gz
Once you have extracted it, just configure, compile and install:
% ./configure --prefix=/opt/nginx --user=nginx --group=nginx
% make
% [sudo] make install
As you can see, we provided /opt/nginx as the prefix to configure, so make sure the /opt directory exists. Also, make sure that there is a user and a group called nginx; if they don’t exist, add them:
% [sudo] adduser --system --no-create-home --disabled-login --disabled-password --group nginx
After that, you can start nginx using the command line below:
% [sudo] /opt/nginx/sbin/nginx
Linode provides an init script that uses start-stop-daemon, you might want to use it.

nginx configuration

nginx comes with a default nginx.conf file; let’s change it to reflect the following configuration requirements:
  • nginx should start workers with the nginx user
  • nginx should have two worker processes
  • the PID should be stored in the /opt/nginx/logs/nginx.pid file
  • nginx must have an access log in /opt/nginx/logs/access.log
  • the configuration for the Django project we’re going to develop should be versioned with the rest of the code, so it must be included from the nginx.conf file (assume that the project lives in the directory /opt/projects).
So here is the nginx.conf for the requirements above:
user  nginx;
worker_processes  2;

pid logs/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                     '$status $body_bytes_sent "$http_referer" '
                     '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  logs/access.log  main;

    sendfile           on;
    keepalive_timeout  65;

    include /opt/projects/showcase/nginx.conf;
}
Now we just need to write the configuration for our Django project. I’m using an old sample project written while I was working at Giran: the name is lojas giranianas, a nonsense Portuguese joke about a famous Brazilian store. It’s an unfinished showcase of products: it’s like an e-commerce project, but it can’t sell anything, so it’s just a product catalog. The code is available at GitHub. The nginx.conf file for the repository is here:
server {
    listen 80;
    server_name localhost;

    charset utf-8;

    location / {
        proxy_set_header    X-Real-IP   $remote_addr;
        proxy_set_header    Host        $http_host;
        proxy_set_header    X-Forwarded-For $proxy_add_x_forwarded_for;

        proxy_pass http://localhost:8000;
    }

    location /static {
        root /opt/projects/showcase/;
        expires 1d;
    }
}
The server listens on port 80 and responds for the localhost hostname (read more about the Host header). The location /static directive says that nginx will serve the project’s static files, and includes an expires directive for cache control. The location / directive uses proxy_pass, forwarding all requests to an upstream server listening on port 8000; this server is the subject of the next post in the series: the Green Unicorn (gunicorn) server.

Not only is the HTTP request itself forwarded to the gunicorn server, but also some headers that help it deal properly with the request:
  • X-Real-IP: forwards the remote address to the upstream server, so it can know the user’s real IP. Without this header, all gunicorn would know is that a request came from localhost (or wherever the nginx server is); the remote address would always be the IP address of the machine where nginx is running (which is what actually makes the request to gunicorn)
  • Host: the Host header is forwarded so gunicorn can treat different requests for different hosts. Without it, it would be impossible for gunicorn to make such distinctions
  • X-Forwarded-For: also known as XFF, this header provides more precise information about the real IP that made the request. Imagine there are 10 proxies between the user’s machine and your web server: the XFF header will list all of them, comma separated. In order not to turn a proxy into an anonymizer, it’s good practice to always forward this header.
So that’s it. In the next post we are going to install and run gunicorn. In later posts, we’ll see how to make automated deploys using Fabric, and some tricks on caching (using the proxy_cache directive and integrating Django, nginx and memcached).

See you in the next posts.