PythonWise: March 2007

Friday, March 30, 2007

HTML Entities

A quick way to see how all the HTML entities are displayed in your browser:


from urllib import urlopen
import re
import webbrowser

W3_URL = "http://www.w3.org/TR/WD-html40-970708/sgml/entities.html"
FILE_NAME = "/tmp/html-entities.html"
# Grab the entity name from each "!ENTITY name ..." declaration
find_entity = re.compile(r"!ENTITY\s+([A-Za-z][A-Za-z0-9]+)").search

fo = open(FILE_NAME, "wt")

print >> fo, "<html><body><table border=\"1\">"

for line in urlopen(W3_URL):
    match = find_entity(line)
    if match:
        entity = match.group(1)
        # One row per entity: its name, then the entity itself for the browser to render
        print >> fo, "<tr><td>%s</td><td>&%s;</td></tr>" % (entity, entity)
print >> fo, "</table></body></html>"
fo.close()

webbrowser.open(FILE_NAME)
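
The script above is Python 2 and needs network access. Here is a rough sketch of the same idea for Python 3, assuming the standard library's `html.entities.name2codepoint` table is an acceptable stand-in for the W3C page. The helper name `build_entity_table` is made up for this example.

```python
from html.entities import name2codepoint


def build_entity_table(names):
    """Return an HTML page with one table row per entity name."""
    rows = ["<html><body><table border=\"1\">"]
    for name in sorted(names):
        # First cell shows the name, second lets the browser render the entity.
        rows.append("<tr><td>%s</td><td>&%s;</td></tr>" % (name, name))
    rows.append("</table></body></html>")
    return "\n".join(rows)


def write_entity_page(path):
    """Write the table for every entity the standard library knows about."""
    with open(path, "w") as fo:
        fo.write(build_entity_table(name2codepoint))
```

Calling `write_entity_page` with a file name and then passing that file name to `webbrowser.open` should give the same kind of page as the original script.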

Say "NO" to Internet Violence

I'll make an exception for Kathy Sierra and post a non-technical entry.

Just say "NO" to any violence on the Internet, and make it a better place for all of us.

Kathy, I hope you'll find the strength to overcome this.

Tuesday, March 27, 2007

Pushing Data - The Easy Way

One of the fastest ways to implement "pushing data to a server" is to have a CGI script on the server and push data to it from the clients.

This way you don't need to write a server, design a protocol, and so on. Just use an existing HTTP server (such as lighttpd) with CGI.

CGI Script:

#!/usr/bin/env python

from cgi import FieldStorage
from myapp import do_something_with_data

ERROR = "<html><body>Error: %s</body></html>"

def main():
   print "Content-Type: text/html"
   print

   form = FieldStorage()
   data = form.getvalue("data", "")
   key = form.getvalue("key", "").strip()
   if not (key and data):
       # Print the error page: SystemExit(message) would write it to
       # stderr, where the client never sees it
       print ERROR % "NO 'key' or 'data'"
       return

   try:
       do_something_with_data(key, data)
   except Exception, e:
       print ERROR % e
       return

   print "<html><body>OK</body></html>"

if __name__ == "__main__":
   main()
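
The `cgi` module was deprecated in Python 3.11 and removed in 3.13, so on a current interpreter the form has to be parsed some other way. Below is a minimal sketch, assuming the client sends a URL-encoded body as the pushing script does. `handle_push` and the `store` callback are hypothetical names standing in for the CGI entry point and `do_something_with_data`.

```python
from urllib.parse import parse_qs

ERROR = "<html><body>Error: %s</body></html>"
OK = "<html><body>OK</body></html>"


def handle_push(body, store):
    """Parse a URL-encoded request body and hand key/data to `store`.

    Returns the HTML page to send back to the client.
    """
    form = parse_qs(body)
    # parse_qs maps each field to a list of values; take the first one.
    data = form.get("data", [""])[0]
    key = form.get("key", [""])[0].strip()
    if not (key and data):
        return ERROR % "NO 'key' or 'data'"

    try:
        store(key, data)
    except Exception as e:
        return ERROR % e

    return OK
```

In a real CGI script the body would be read from standard input (its length is given by the `CONTENT_LENGTH` environment variable) and the returned page printed after the `Content-Type` header.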

"Pushing" script:

#!/usr/bin/env python

from urllib import urlopen, urlencode

CGI_URL = "http://localhost:8080/load.cgi"

def push_data(key, data):
   # Passing the query as the second argument makes urlopen send a POST
   query = urlencode([("data", data), ("key", key)])
   try:
       urlopen(CGI_URL, query).read()
   except IOError, e:
       pass # FIXME: Handle error

def main(argv=None):
   if argv is None:
       import sys
       argv = sys.argv

   from optparse import OptionParser
   from os.path import isfile, basename

   parser = OptionParser("usage: %prog FILENAME")

   opts, args = parser.parse_args(argv[1:])
   if len(args) != 1:
       parser.error("wrong number of arguments") # Will exit

   filename = args[0]
   if not isfile(filename):
       raise SystemExit("error: can't find %s" % filename)

   key = basename(filename)
   data = open(filename, "rb").read()

   push_data(key, data)


if __name__ == "__main__":
   main()
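
For Python 3, where `urlopen` and `urlencode` moved to `urllib.request` and `urllib.parse`, the pushing side might look roughly like this. It is only a sketch: `build_push_request` is a name invented here so the request can be inspected without a running server, and the URL is the same placeholder used above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CGI_URL = "http://localhost:8080/load.cgi"  # same placeholder URL as above


def build_push_request(key, data, url=CGI_URL):
    """Build the POST request that carries `key` and `data`."""
    body = urlencode([("data", data), ("key", key)]).encode("ascii")
    # Supplying a body makes urllib send a POST instead of a GET.
    return Request(url, data=body)


def push_data(key, data, url=CGI_URL):
    """Send the request and return the server's reply as bytes."""
    with urlopen(build_push_request(key, data, url)) as reply:
        return reply.read()
```

Splitting the request building from the sending keeps the encoding step easy to check on its own.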

(Thanks to Martin for the idea)

Wednesday, March 21, 2007

`defaultdict`

Python 2.5 adds a defaultdict dictionary to the collections module.
Its constructor takes a factory function, which is called to create the default value each time you look up a missing key.
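
To make the "factory is called on a missing key" behaviour concrete, here is a small sketch (written for Python 3, though it should behave the same on 2.5). The `factory` function and the `calls` list exist only to record when the default is created.

```python
from collections import defaultdict

calls = []


def factory():
    """Record each call, then return the default value."""
    calls.append("called")
    return []


d = defaultdict(factory)

d["a"].append(1)      # "a" is missing: factory() runs and its result is stored
d["a"].append(2)      # "a" now exists: factory() is not called again
missing = d.get("b")  # .get() bypasses the factory and does not add "b"
```

After this runs, `calls` holds a single entry, `d` contains only the key "a", and `missing` is `None`.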

With defaultdict, you can write a word histogram function like this:

from collections import defaultdict

def histogram(text):
   counts = defaultdict(int) # int() -> 0
   for word in text.split():
       counts[word] += 1
   return counts
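
For comparison, this sketch puts a plain dict version next to the defaultdict one, to show what the factory saves you from writing. The function names and the sample text are made up for the example.

```python
from collections import defaultdict


def histogram_plain(text):
    """Word counts with a regular dict: the default is spelled out by hand."""
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def histogram_default(text):
    """Same counts, with defaultdict(int) supplying the initial 0."""
    counts = defaultdict(int)
    for word in text.split():
        counts[word] += 1
    return counts


text = "the cat and the hat"
same = histogram_plain(text) == histogram_default(text)  # both count "the" twice
```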

Or, if you want to store the locations of the words as well:

def histogram(text):
   histogram = defaultdict(list) # list() -> []
   for location, word in enumerate(text.split()):
       histogram[word].append(location)
   return histogram
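
A quick usage sketch of the location version (Python 3 syntax; the function is renamed `word_locations` here only to keep it apart from the counting one):

```python
from collections import defaultdict


def word_locations(text):
    locations = defaultdict(list)  # list() -> []
    for location, word in enumerate(text.split()):
        locations[word].append(location)
    return locations


index = word_locations("to be or not to be")
print(sorted(index.items()))
# [('be', [1, 5]), ('not', [3]), ('or', [2]), ('to', [0, 4])]
```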
