If it won't be simple, it simply won't be. [Hire me, source code] by Miki Tebeka, CEO, 353Solutions

Friday, December 12, 2008


Get email notification whenever your program crashes.

Friday, November 14, 2008

The Code You Don't Write

Act without doing, work without effort.
Think of the small as large and the few as many.
Confront the difficult while it is still easy;
accomplish the great task by a series of small steps.
- Lao-Tze

Sometimes, the code you don't write is more important than the one you write.
Whenever I start on a new task, my first question is "how can I do this without coding?".

Here's a small (true) example:

We had a problem: serving files from an NFS-mounted volume was slow on the first request and then fine afterwards. Probably the mount went "stale" after a while.

The first option, learning the inner workings of NFS mounts, was dropped immediately - you can never know how much time such tinkering will take, or whether it will work in the end.

So I decided to keep the NFS mount "warm" by periodically accessing it.

First Version:

from time import sleep
from os import listdir

while 1:
    listdir("/path/to/nfs")
    sleep(10 * 60)  # 10 minutes

and in crontab

@reboot /usr/bin/python /path/to/script.py

Then I thought - "cron", and the second version came:

from os import listdir

listdir("/path/to/nfs")

and in crontab:

*/10 * * * * /usr/bin/python /path/to/script.py

And then last version (just crontab):

*/10 * * * * /bin/ls /path/to/nfs

So down from 6 LOC to 3 LOC to 1 LOC - that's what I call productivity :)

Saturday, November 08, 2008

Where is Miki?

A little CGI script that shows where I am (it gets the data from my Google calendar).

Using Google Data Python API

#!/usr/bin/env python
'''Where am I? (data from Google calendar)

Get gdata from http://code.google.com/p/gdata-python-client/
'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

import gdata.calendar.service as cal_service
from time import localtime, strptime, strftime, mktime, timezone

DAY = 24 * 60 * 60

def caltime_to_local(caltime):
    # 2008-11-07T23:30:00.000+02:00
    t = mktime(strptime(caltime[:16], "%Y-%m-%dT%H:%M"))
    tz_h, tz_m = map(int, caltime[-5:].split(":"))
    cal_tz = (tz_h * 60 * 60) + (tz_m * 60)
    if caltime[-6] == "-":
        cal_tz = -cal_tz

    # See timezone documentation, the sign is reversed
    diff = -timezone - cal_tz

    return localtime(t + diff)

def iter_meetings():
    client = cal_service.CalendarService()
    client.email = "your-google-user-name"
    client.password = "your-google-password"
    client.source = "Where-is-Miki"

    query = cal_service.CalendarEventQuery("default", "private", "full")
    query.start_min = strftime("%Y-%m-%d")
    tomorrow = localtime(mktime(localtime()) + DAY)
    query.start_max = strftime("%Y-%m-%d", tomorrow)
    feed = client.CalendarQuery(query)
    for event in feed.entry:
        title = event.title.text
        when = event.when[0]
        start = caltime_to_local(when.start_time)
        end = caltime_to_local(when.end_time)

        yield title, start, end

def find_meeting(meetings, now):
    # No printing here: anything written before the CGI header breaks the page
    for title, start, end in meetings:
        if start <= now <= end:
            return title, end

    return None, None

def meetings_html(meetings):
    if not meetings:
        return "No meetings today"

    trs = []
    tr = "<tr><td>%s</td><td>%s</td><td>%s</td></tr>"
    for title, start, end in meetings:
        start = strftime("%H:%M", start)
        end = strftime("%H:%M", end)
        trs.append(tr % (title, start, end))

    return "Today's meetings: <table border='1'>" + \
           "<tr><th>Title</th><th>Start</th><th>End</th></tr>" + \
           "\n".join(trs) + \
           "</table>"

HTML = '''
<html>
<head>
<title>Where is Miki?</title>
<style>
body, td, th {
    font-family: Monospace;
    font-size: 22px;
}
</style>
</head>
<body>
<h1>Where is Miki?</h1>
Seems that he is <b>%s</b>.
<br />
%s
</body>
</html>
'''

if __name__ == "__main__":
    import cgitb; cgitb.enable()
    from operator import itemgetter

    days = ["Mon","Tue","Wed", "Thu", "Fri", "Sat", "Sun"]
    now = localtime()

    day = days[now.tm_wday]
    meetings = sorted(iter_meetings(), key=itemgetter(-1))

    # Yeah, yeah - I get in early
    if (now.tm_hour < 6) or (now.tm_hour > 17):
        where = "at home"
    elif day in ["Sat", "Sun"]:
        where = "at home"
    else:
        # Argument order matches the definition: (meetings, now)
        title, end = find_meeting(meetings, now)
        if end:
            where = "meeting %s (until %s)" % (title, strftime("%H:%M", end))
        else:
            where = "at work"

    print "Content-Type: text/html\n"
    print HTML % (where, meetings_html(meetings))
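
On Python 3 the offset arithmetic in caltime_to_local can probably be left to the standard library. The sketch below is not part of the script above: parse_caltime is a hypothetical name, and it assumes every timestamp looks like the sample in the comment (millisecond precision followed by an explicit +hh:mm or -hh:mm offset).

```python
from datetime import datetime, timedelta, timezone

def parse_caltime(caltime):
    """Parse e.g. '2008-11-07T23:30:00.000+02:00' into an aware datetime."""
    # The first 19 characters hold the date and time without milliseconds
    naive = datetime.strptime(caltime[:19], "%Y-%m-%dT%H:%M:%S")
    # The last 6 characters hold the sign and the hh:mm offset
    sign = -1 if caltime[-6] == "-" else 1
    hours, minutes = map(int, caltime[-5:].split(":"))
    offset = sign * timedelta(hours=hours, minutes=minutes)
    return naive.replace(tzinfo=timezone(offset))

start = parse_caltime("2008-11-07T23:30:00.000+02:00")
print(start.astimezone(timezone.utc))  # 2008-11-07 21:30:00+00:00
```

Calling astimezone() with no argument converts to the machine's local zone, which is what the CGI script needs.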

Wednesday, November 05, 2008

Document With Examples

I've found that when I write parsing code, it's often best to document the functions with examples of the input they handle.

A small example:
#!/usr/bin/env python
'''Simple script showing how to document with examples'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

import re

# HTTP/1.1 200 OK
# HTTP/1.1 301 Moved Permanently
def http_code(line):
    return line.split()[1]

if __name__ == "__main__":
    print http_code("HTTP/1.1 301 Moved Permanently")
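
One step further is to turn the example inputs into doctests, so the documentation is checked every time the tests run. A minimal sketch in Python 3 (the docstring wording is mine; the function is the one above):

```python
import doctest

def http_code(line):
    """Return the status code of an HTTP status line, as a string.

    >>> http_code("HTTP/1.1 200 OK")
    '200'
    >>> http_code("HTTP/1.1 301 Moved Permanently")
    '301'
    """
    return line.split()[1]

if __name__ == "__main__":
    doctest.testmod()
```

Running the file prints nothing when the examples still match the code.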

Monday, October 13, 2008

JavaScript sound player

It's very easy to connect Adobe FLEX to JavaScript.

We'll create a simple sound player that exposes two functions: play and stop. It'll also call the JavaScript function on_play_complete when the current sound has finished playing.


Compile with mxmlc soundplayer.mxml


Tuesday, September 23, 2008

"Disabling" an image

Sometimes you want to mark a button image as "disabled". The usual method is to have two images and display the "disabled" state image when disabled.

However, you can also use the image opacity to mark it as disabled:

<style>
.disabled {
    filter: alpha(opacity=50);
    -moz-opacity: 0.50;
    opacity: 0.50;
}
</style>
<p>Show how to "dim" an image, marking it disabled</p>
<img src="image.png" id="image" /> <br />
<button onclick="disable();">Disable</button>
<button onclick="enable();">Enable</button>
<script src="jquery.js"></script>
<script>
function disable() {
    $("#image").addClass("disabled");
}

function enable() {
    $("#image").removeClass("disabled");
}
</script>

Thursday, September 18, 2008

Desktop Web Application

It's very easy to have the browser window as your user interface, even in desktop applications.

Below is a small phone book demo.

General Design:
  • Have a web server serve the main search page
  • Run an AJAX query and display the results
  • fork the server so the program will return

webphone.py (AKA the web server)

index.html (AKA the GUI)


Which version of module do I have?

#!/usr/bin/env python
'''Find python module version'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

def valueof(v):
    if callable(v):
        try:
            return v()
        except Exception:
            return None
    return v

def load_module(module_name):
    module = __import__(module_name)

    # __import__("a.b") will give us a
    if ("." in module_name):
        names = module_name.split(".")[1:]
        while names:
            name = names.pop(0)
            module = getattr(module, name)

    return module

def find_module_version(module_name):
    module = load_module(module_name)
    attrs = set(dir(module))

    # Try the well known names first
    for known in ("__version__", "version"):
        if known in attrs:
            v = valueof(getattr(module, known))
            if v:
                return v

    # Fall back to anything that looks like a version
    for attr in attrs:
        if "version" in attr.lower():
            v = getattr(module, attr)
            if not v:
                continue
            v = valueof(v)
            if v:
                return v

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    from optparse import OptionParser

    parser = OptionParser("usage: %prog MODULE_NAME")

    opts, args = parser.parse_args(argv[1:])
    if len(args) != 1:
        parser.error("wrong number of arguments") # Will exit

    module_name = args[0]

    try:
        version = find_module_version(module_name)
    except ImportError, e:
        raise SystemExit("error: can't import %s (%s)" % (module_name, e))

    if version:
        print version
    else:
        raise SystemExit("error: can't find version for %s" % module_name)

if __name__ == "__main__":
    main()
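
On Python 3.8 and later, installed distributions can be asked for their version through importlib.metadata, which avoids guessing attribute names in most cases. A hedged sketch (module_version is a hypothetical name; note that the distribution name passed to version() is not always the same as the import name):

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version

def module_version(name):
    """Return the version for `name`, or None if it can't be determined."""
    try:
        # Works when `name` is an installed distribution
        return version(name)
    except PackageNotFoundError:
        # Fall back to the conventional attribute on the module itself
        module = import_module(name)
        return getattr(module, "__version__", None)
```

The fallback keeps the spirit of the script above for modules that are importable but were not installed as a distribution.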

Tuesday, September 16, 2008

Exit Gracefully

When your program is terminated by a signal, the atexit handlers are not called.

A short solution:
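
One common way to do it, shown here as a Python 3 sketch rather than as the original snippet: install a handler that turns the signal into a normal interpreter exit, so the atexit machinery runs. The names cleanup and on_signal are placeholders.

```python
import atexit
import signal
import sys

def cleanup():
    print("cleaning up")

def on_signal(signum, frame):
    # sys.exit raises SystemExit, so the interpreter shuts down normally
    # and the registered atexit handlers get called
    sys.exit(128 + signum)

atexit.register(cleanup)
signal.signal(signal.SIGTERM, on_signal)
```

This covers SIGTERM only; SIGKILL cannot be caught, so no handler will help there.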

Sunday, September 07, 2008

"unpack" updated

Updated the code to unpack and added view functionality to it.

Thursday, September 04, 2008


A "cross platform" command line utility to place things in the clipboard.
(On linux uses xsel)
#!/usr/bin/env python
'''Place stuff in clipboard - multi platform'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

from os import popen
from sys import platform

COMMANDS = { # platform -> command
    "darwin" : "pbcopy",
    "linux2" : "xsel -i",
    "cygwin" : "/bin/putclip",
}

def putclip(text):
    command = COMMANDS[platform]
    popen(command, "w").write(text)

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    from optparse import OptionParser
    from sys import stdin

    parser = OptionParser("%prog [PATH]")

    opts, args = parser.parse_args(argv[1:])

    if len(args) not in (0, 1):
        parser.error("wrong number of arguments") # Will exit

    if platform not in COMMANDS:
        message = "error: don't know how to handle clipboard on %s" % platform
        raise SystemExit(message)

    if (not args) or (args[0] == "-"):
        info = stdin
    else:
        try:
            infile = args[0]
            info = open(infile)
        except IOError, e:
            raise SystemExit("error: can't open %s - %s" % (infile, e))

    try:
        putclip(info.read())
    except OSError, e:
        raise SystemExit("error: %s" % e)

if __name__ == "__main__":
    main()

Friday, August 15, 2008


#!/usr/bin/env python

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

def flatten(items):
    '''Flatten a nested list.

    >>> a = [[1], 2, [[[3]], 4]]
    >>> list(flatten(a))
    [1, 2, 3, 4]
    '''
    for item in items:
        if getattr(item, "__iter__", None):
            for subitem in flatten(item):
                yield subitem
        else:
            yield item

if __name__ == "__main__":
    from doctest import testmod
    testmod()
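
A note for Python 3: there, strings also have `__iter__`, so the check above would recurse forever on a one-character string. A sketch of a Python 3 variant that treats strings and bytes as leaves (my adaptation, not the original code):

```python
from collections.abc import Iterable

def flatten(items):
    """Yield the leaves of an arbitrarily nested iterable."""
    for item in items:
        # Strings are iterable but should be kept whole
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item

print(list(flatten([[1], 2, [[[3]], "ab"]])))  # [1, 2, 3, 'ab']
```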


Thursday, August 07, 2008


'''Quick and dirty object "repr"'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"
# FIXME: Find how to make doctest play with "regular" class definition

def printobj(obj):
    '''
    Quick and dirty object "repr"

    >>> class Point: pass
    >>> p = Point()
    >>> p.x, p.y = 1, 2
    >>> printobj(p)
    ('y', 2)
    ('x', 1)
    '''
    print "\n".join(map(str, obj.__dict__.items()))

if __name__ == "__main__":
    from doctest import testmod
    testmod()

Thursday, July 24, 2008

CGI trampoline for cross site AJAX

Most (all?) browsers won't let you do cross site AJAX calls.

One solution is JSONP (which is supported by jQuery). However not all servers support it.

The other solution is to create a "trampoline" in your site that returns the data from the remote site:
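
A minimal sketch of such a trampoline in Python 3, under a few loudly stated assumptions: the names trampoline and ALLOWED_HOSTS and the host api.example.com are all hypothetical, the remote site is assumed to return JSON, and the allow-list is my addition (without one the script would be an open proxy).

```python
from urllib.parse import urlparse
from urllib.request import urlopen

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allow-list of remote sites

def fetch_remote(url):
    # Plain HTTP GET, body decoded as UTF-8 text
    with urlopen(url) as response:
        return response.read().decode("utf-8")

def trampoline(url, fetch=fetch_remote):
    """Build the CGI response (headers and body) for a proxied request."""
    if urlparse(url).hostname not in ALLOWED_HOSTS:
        return ("Status: 403 Forbidden\n"
                "Content-Type: text/plain\n\n"
                "host not allowed\n")
    return "Content-Type: application/json\n\n" + fetch(url)
```

A CGI script would read the url parameter from the query string and print the result of trampoline(url); the page then calls the script on its own site instead of the remote one.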

Sunday, July 20, 2008


#!/usr/bin/env python
'''Find which SCM a directory is under'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

from os import sep
from os.path import join, isdir, abspath
from itertools import ifilter, imap

def updirs(path):
    '''Yield path and each of its parent directories'''
    parts = path.split(sep)
    if not parts[0]:
        parts[0] = sep # FIXME: Windows
    while parts:
        yield join(*parts)
        parts.pop()

def scmdirs(path, scms):
    for scmext in scms:
        yield join(path, scmext)

def scm(dirname):
    return dirname[-3:].lower()

def scms(path, scms):
    return imap(scm, ifilter(isdir, scmdirs(path, scms)))

def whichscm(path):
    path = abspath(path)

    # svn and CVS keep a metadata directory in every versioned directory
    for scm in scms(path, (".svn", "CVS")):
        return scm

    # The others keep one at the repository root, so walk up
    scmdirs = (".bzr", ".hg", ".git")
    for dirname in updirs(path):
        for scm in scms(dirname, (".bzr", ".hg", ".git")):
            return scm

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    from optparse import OptionParser

    parser = OptionParser("usage: %prog [DIRNAME]")

    opts, args = parser.parse_args(argv[1:])
    if len(args) not in (0, 1):
        parser.error("wrong number of arguments") # Will exit

    dirname = args[0] if args else "."
    if not isdir(dirname):
        raise SystemExit("error: %s is not a directory" % dirname)

    scm = whichscm(dirname)
    if not scm:
        raise SystemExit("error: can't find scm for %s" % dirname)

    print scm

if __name__ == "__main__":
    main()
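
The "walk up the directory tree" part is shorter with Python 3's pathlib. The sketch below is a simplification, not a port: which_scm and SCM_DIRS are my names, and unlike the script above it looks for every marker directory at every level.

```python
from pathlib import Path

SCM_DIRS = (".git", ".hg", ".bzr", ".svn", "CVS")

def which_scm(path):
    """Return the SCM name for `path`, or None if no marker is found."""
    start = Path(path).resolve()
    # Check the directory itself, then each parent up to the root
    for directory in (start, *start.parents):
        for name in SCM_DIRS:
            if (directory / name).is_dir():
                return name.lstrip(".").lower()
    return None
```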

Thursday, July 17, 2008


#!/usr/bin/env python
'''Find out who is listening on a port'''

from os import popen
from os.path import isdir
import re

is_int = re.compile("\d+").match

def find_pid(port):
    for line in popen("netstat -nlp 2>&1"):
        match = re.search(":(%s)\\s+" % port, line)
        if not match:
            continue

        # Last column looks like PID/program-name
        pidname = line.split()[-1].strip()
        return pidname.split("/")[0]

    return None

def find_cmdline(pid):
    cmd = open("/proc/%s/cmdline" % pid, "rb").read()

    return " ".join(cmd.split(chr(0)))

def find_pwd(pid):
    data = open("/proc/%s/environ" % pid, "rb").read()
    for line in data.split(chr(0)):
        if line.startswith("PWD"):
            return line.split("=")[1]

    return None

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    from optparse import OptionParser

    parser = OptionParser("usage: %prog PORT")
    opts, args = parser.parse_args(argv[1:])
    if len(args) != 1:
        parser.error("wrong number of arguments") # Will exit

    port = args[0]
    pid = find_pid(port)
    if not (pid and is_int(pid)):
        raise SystemExit(
            "error: can't find who listens on port %s"
            " [try again with sudo?] " % port)

    if not isdir("/proc/%s" % pid):
        raise SystemExit("error: can't find information on pid %s" % pid)

    pwd = find_pwd(pid) or "<unknown>"
    print "%s (pid=%s, pwd=%s)" % (find_cmdline(pid), pid, pwd)

if __name__ == "__main__":
    main()

Note: This does not work on OSX (no /proc and different netstat api)

Monday, July 14, 2008

Code in googlecode

I'll post all the code shown here in http://pythonwise.googlecode.com/.

I've uploaded most of the code from 2008 to 2006, will add the other stuff bit by bit.

Friday, July 11, 2008

Computer Load - The AJAX Way

Show computer load using jQuery, flot and Python's BaseHTTPServer (all in less than 70 lines of code).

#!/usr/bin/env python
'''Server to show computer load'''

import re
from os import popen
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
from socket import gethostname

def load():
    '''Very fancy computer load :)'''
    output = popen("uptime").read()
    match = re.search("load average(s)?:\\s+(\\d+\\.\\d+)", output)
    return float(match.groups()[1]) * 100

HTML = '''
<html>
<head>
<script src="jquery.js"></script>
<script src="jquery.flot.js"></script>
<title>%s load</title>
</head>
<body>
<h1>%s load</h1>
<div id="chart" style="width:600px;height:400px;">
Loading ...
</div>
<script>
var samples = [];
var options = {
    yaxis: {
        min: 0,
        max: 100
    },
    xaxis: {
        ticks: []
    }
};

function get_data() {
    $.getJSON("/data", function(data) {
        samples.push(data);
        if (samples.length > 120) {
            samples.shift();
        }

        var xy = [];
        for (var i = 0; i < samples.length; ++i) {
            xy.push([i, samples[i]]);
        }
        $.plot($('#chart'), [xy], options);
    });
}

$(document).ready(function() {
    setInterval(get_data, 1000);
});
</script>
</body>
</html>
''' % (gethostname(), gethostname())

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/":
            self.wfile.write(HTML)
        elif self.path.endswith(".js"):
            self.wfile.write(open(".%s" % self.path).read())
        else:
            # Anything else (e.g. /data) returns the current load
            self.wfile.write("%.2f" % load())

if __name__ == "__main__":
    server = HTTPServer(("", 8888), RequestHandler)
    server.serve_forever()
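
The regular expression in load() is the fragile part, so here it is on its own with sample inputs. This is a Python 3 sketch; parse_load is a hypothetical helper, the sample uptime lines are illustrative rather than captured output, and accepting a decimal comma is my assumption about some locales.

```python
import re

def parse_load(uptime_output):
    """Return the 1-minute load average found in `uptime` output."""
    # Linux prints "load average:", OS X prints "load averages:"
    match = re.search(r"load averages?:\s+(\d+[.,]\d+)", uptime_output)
    if match is None:
        raise ValueError("no load average in %r" % uptime_output)
    return float(match.group(1).replace(",", "."))

print(parse_load(" 10:15:02 up 3 days,  2:41,  2 users,  load average: 0.42, 0.35, 0.30"))
print(parse_load("10:15  up 3 days, 2 users, load averages: 1.52 1.60 1.70"))
```

On Unix systems os.getloadavg() in the standard library returns the same three numbers without running uptime at all.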

Wednesday, July 09, 2008

bazaar is slow - who cares?

BS: git is faster than mercurial is faster than bazaar!
ME: Frankly dear, I don't give a damn.
BS: But speed is important!
ME: It is. However when you choose a source control system (if you have the
privilege of doing so), there are many more things to consider:
  • Does it fit my work model?
  • Is it stable?
  • Will it stay for long?
  • What's the community like?
  • Is development active?
  • ...
  • Is it fast enough?
BS: But bazaar is the slowest
ME: For many, many projects, it's fast enough
BS: So which one to choose?
ME: You do your own math. I chose `bazaar` because it has two features that
the others (to my knowledge) don't have:
  • It knows about directories (I like to check-in empty logs directory - it simplifies the logging code)
  • You can check-in files from another directory (see here)
And, it's fast enough for me (about 1sec for bzr st on ~200K LOC)

BS: OK! ... but git storage is better than mercurial is better than bazaar!
ME: <sigh> Why do I even bother? </sigh> ...

Next week - LISP is faster than Python ;)

BS = blogosphere
ME = me

UPDATE (2009/01/27)
Bazaar's slowness started to annoy me too much. I felt that every "bzr st" was taking way too much time (not to mention the updates). So I switched to mercurial.

The difference in timing is even noticeable in the most trivial operations:

[09:19] $time hg --version > /dev/null

real 0m0.060s
user 0m0.048s
sys 0m0.012s
[09:20] fattoc $time bzr --version > /dev/null

real 0m0.191s
user 0m0.144s
sys 0m0.048s
[09:21] $

You feel these 0.13 seconds. It seems that hg --version returns immediately but bzr --version takes its time.

Sometimes speed *does* matter.

Monday, June 23, 2008


SmokeJS is a discovery based unittest framework for JavaScript (like nose and py.test)

You can run tests either in the command line (with SpiderMonkey or Rhino) or in the browser.

Go, check it out and file bugs...

Friday, June 20, 2008

Better "start"

Updated start to support multiple desktop managers in Linux.

Thursday, June 05, 2008


Yet another enum implementation.

Thursday, May 29, 2008

Wednesday, May 21, 2008


Suppose you want to find the next n elements of a stream that match a predicate.

(I just used it in web scraping with BeautifulSoup to get the next 5 sibling "tr" for a table).
#!/usr/bin/env python

from itertools import ifilter, islice

def next_n(items, pred, count):
    return islice(ifilter(pred, items), count)

if __name__ == "__main__":
    from gmpy import is_prime
    from itertools import count
    for prime in next_n(count(1), is_prime, 10):
        print prime

Will print the first ten primes, 2 through 29.
(Using gmpy for is_prime)
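
For readers on Python 3, where ifilter is gone and the built-in filter is already lazy, the same idea looks like this. The trial-division is_prime is a stand-in I added so the sketch runs without gmpy.

```python
from itertools import count, islice

def next_n(items, pred, n):
    """Lazily take the first n items that satisfy pred."""
    return islice(filter(pred, items), n)

def is_prime(n):
    # Naive trial division, good enough for a demo
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

print(list(next_n(count(1), is_prime, 10)))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Because everything is lazy, this works on the infinite count(1) stream and stops as soon as the tenth match is found.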

Thursday, May 15, 2008


Generating tagcloud (using Mako here, but you can use any other templating system)



Fading Div

Add a new fading (background) div to your document:

function set_color(elem) {
    var colorstr = elem.fade_color.toString(16).toUpperCase();
    /* Pad to 6 digits */
    while (colorstr.length < 6) {
        colorstr = "0" + colorstr;
    }
    elem.style.background = "#" + colorstr;
}

function fade(elem, color) {
    if (typeof(color) != "undefined") {
        elem.fade_color = color;
    }
    else {
        elem.fade_color += 0x001111;
    }
    set_color(elem);

    if (elem.fade_color < 0xFFFFFF) {
        setTimeout(function() { fade(elem); }, 200);
    }
}

function initialize()
{
    var div = document.createElement("div");
    div.innerHTML = "I'm Fading";
    document.body.appendChild(div);

    fade(div, 0xFF0000); /* Red */
}

window.onload = initialize;

Wednesday, April 30, 2008

XML RPC File Server

#!/usr/bin/env python
'''Simple file client/server using XML RPC'''

from SimpleXMLRPCServer import SimpleXMLRPCServer
from xmlrpclib import ServerProxy, Error as XMLRPCError
import socket

def get_file(filename):
    fo = open(filename, "rb")
    try: # When will "with" be here?
        return fo.read()
    finally:
        fo.close()

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    default_port = "3030"
    from optparse import OptionParser

    parser = OptionParser("usage: %prog [options] [[HOST:]PORT]")
    parser.add_option("--get", help="get file", dest="filename",
                      action="store", default="")

    opts, args = parser.parse_args(argv[1:])
    if len(args) not in (0, 1):
        parser.error("wrong number of arguments") # Will exit

    if args:
        port = args[0]
    else:
        port = default_port

    if ":" in port:
        host, port = port.split(":")
    else:
        host = "localhost"

    try:
        port = int(port)
    except ValueError:
        raise SystemExit("error: bad port - %s" % port)

    # Client mode
    if opts.filename:
        try:
            proxy = ServerProxy("http://%s:%s" % (host, port))
            print proxy.get_file(opts.filename)
            raise SystemExit
        except XMLRPCError, e:
            error = "error: can't get %s (%s)" % (opts.filename, e.faultString)
            raise SystemExit(error)
        except socket.error, e:
            raise SystemExit("error: can't connect (%s)" % e)

    # Server mode
    server = SimpleXMLRPCServer(("localhost", port))
    server.register_function(get_file)
    print "Serving files on port %d" % port
    server.serve_forever()

if __name__ == "__main__":
    main()

This is a huge security hole, use at your own risk.

Friday, April 18, 2008


# Do the `./configure && make && sudo make install` dance, given a download URL

if [ $# -ne 1 ]; then
    echo "usage: `basename $0` URL"
    exit 1
fi

set -e # Fail on errors

url=$1

wget --no-check-certificate $url
archive=`basename $url`

if echo $archive | grep -q .tar.bz2; then
    tar -xjf $archive
else
    tar -xzf $archive
fi

cd ${archive/.tar*}

if [ -f setup.py ]; then
    sudo python setup.py install
else
    ./configure && make && sudo make install
fi

cd ..

Tuesday, April 15, 2008

Registering URL clicks

Some sites (such as Google) give you a "trampoline" URL so they can register what you have clicked on. I find it highly annoying, since you can't tell where you are going just by hovering over the URL and you can't "copy link location" to a document.

The problem is that these people are just lazy:
<html>
   <body>
       <a href="http://pythonwise.blogspot.com"
           onclick="jump(this, 1);">Pythonwise</a> knows.
   </body>
   <script src="jquery.js"></script>
   <script>
       function jump(url, value)
       {
           $.post("jump.cgi", {
               url: url,
               value: value
           });

           return true;
       }
   </script>
</html>

  • Using jQuery
  • "value" can be anything you want to identify this specific click. I'd use a UUID and some table for registering who is the user, what is the url, the time ...

Wednesday, April 09, 2008


# How many check-ins did I do today?
# Without arguments will default to current directory

svn log -r"{`date +%Y%m%d`}:HEAD" $1 | grep "| $USER |" | wc -l

Thursday, April 03, 2008

FeedMe - A simple web-based RSS reader

A simple web-based RSS reader in less than 100 lines of code.

Using feedparser, jQuery and plain old CGI.


<title>FeedMe - A Minimal Web Based RSS Reader</title>
<link rel="stylesheet" type="text/css" href="feedme.css" />
<link rel="shortcut icon" href="feedme.ico" />
a {
    text-decoration: none;
}
a:hover {
    background-color: silver;
}
div.summary {
    display: none;
    position: absolute;
    background: gray;
    width: 70%;
    font: 18px monospace;
    border: 1px solid black;
}
<h2>FeedMe - A Minimal Web Based RSS Reader</h2>
Feed URL: <input type="text" size="80" id="feed_url"/>
<button onclick="refresh_feed();">Load</button>
<hr />
<div id="items">
<script src="jquery.js"></script>
function refresh_feed() {
    var url = $.trim($("#feed_url").val());
    if ("" == url) {
        return;
    }

    $("#items").load("feed.cgi", {"url" : url});
    /* Update every minute */
    setTimeout("refresh_feed();", 1000 * 60);
}


#!/usr/bin/env python

import feedparser
from cgi import FieldStorage, escape
from time import ctime

<a href="%(link)s"
</a> <br />
<div class="summary" id="%(eid)s">

def main():
    print "Content-type: text/html\n"

    form = FieldStorage()
    url = form.getvalue("url", "")
    if not url:
        raise SystemExit("error: no url given")

    try:
        feed = feedparser.parse(url)
        for enum, entry in enumerate(feed.entries):
            entry.eid = "entry%d" % enum
            html = ENTRY_TEMPLATE % entry
            print html
    except Exception, e:
        # FIXME: Log errors
        pass

    print "<br />%s" % ctime()

if __name__ == "__main__":
    main()

How it works:

  • The JavaScript call loads the output of feed.cgi into the items div
  • feed.cgi reads the RSS feed from the given URL and outputs an HTML fragment
  • Hovering over a title will show the entry summary
  • setTimeout makes sure we refresh the view every minute

Wednesday, March 26, 2008


# Quickly serve files over HTTP

# Miki Tebeka <miki.tebeka@gmail.com>

usage="usage: `basename $0` PATH [PORT]"

if [ $# -ne 1 ] && [ $# -ne 2 ]; then
    echo $usage >&2
    exit 1
fi

case $1 in
    "-h" | "-H" | "--help" ) echo $usage; exit;;
    * ) path=$1; port=$2;;
esac

if [ ! -d $path ]; then
    echo "error: $path is not a directory" >&2
    exit 1
fi

cd $path
python -m SimpleHTTPServer $port

Tuesday, March 18, 2008


def unique(items):
    '''Remove duplicate items from a sequence, preserving order

    >>> unique([1, 2, 3, 2, 1, 4, 2])
    [1, 2, 3, 4]
    >>> unique([2, 2, 2, 1, 1, 1])
    [2, 1]
    >>> unique([1, 2, 3, 4])
    [1, 2, 3, 4]
    >>> unique([])
    []
    '''
    seen = set()

    def is_new(obj, seen=seen, add=seen.add):
        if obj in seen:
            return 0
        add(obj)
        return 1

    return filter(is_new, items)
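
On Python 3.7 and later, dictionaries preserve insertion order, so the same result is available as a one-liner. A sketch (same function name, my implementation); like the version above it requires hashable items.

```python
def unique(items):
    """Remove duplicates from a sequence, preserving first-seen order."""
    # dict keys are unique and keep insertion order
    return list(dict.fromkeys(items))

print(unique([1, 2, 3, 2, 1, 4, 2]))  # [1, 2, 3, 4]
```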

Thursday, February 21, 2008


OK, not Python - but sometimes bash is a better tool.

# Extract audio from video files
# Uses mplayer and lame

# Miki Tebeka <miki.tebeka@gmail.com>

if [ $# -ne 2 ]; then
    echo "usage: `basename $0` INPUT_VIDEO OUTPUT_MP3"
    exit 1
fi

infile=$1
outfile=$2
# Any unused temporary path will do for the FIFO
fifoname=/tmp/`basename $0`.$$.fifo

if [ ! -f $infile ]; then
    echo "error: can't find $infile"
    exit 1
fi

if [ -f $outfile ]; then
    echo "error: $outfile exists"
    exit 1
fi

mkfifo $fifoname
mplayer -vc null -vo null -ao pcm:fast -ao pcm:file=$fifoname $1&
lame $fifoname $outfile
rm $fifoname

Wednesday, February 20, 2008


#!/usr/bin/env python
'''Path filter, to be used in pipes to filter out paths.

* Unix test commands (such as -f) can be specified as well
* {} replaces file name

# List only files in current directory
ls -a | pfilter -f

# Find files not versioned in svn
# (why, oh why, does svn *always* return 0?)
find . | pfilter 'test -n "`svn info {} 2>&1 | grep Not`"'
'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

from os import system

def pfilter(path, command):
    '''Filter path according to command'''

    if "{}" in command:
        command = command.replace("{}", path)
    else:
        command = "%s %s" % (command, path)

    if command.startswith("-"):
        command = "test %s" % command

    # FIXME: win32 support
    # Redirect stdout first, then point stderr at it, to silence both
    command += " > /dev/null 2>&1"

    return system(command) == 0

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    from sys import stdin
    from itertools import imap, ifilter
    from string import strip
    from functools import partial

    if len(argv) != 2:
        from os.path import basename
        from sys import stderr
        print >> stderr, "usage: %s COMMAND" % basename(argv[0])
        print >> stderr
        print >> stderr, __doc__
        raise SystemExit(1)

    command = argv[1]
    # Don't you love functional programming?
    for path in ifilter(partial(pfilter, command=command), imap(strip, stdin)):
        print path

if __name__ == "__main__":
    main()

Tuesday, February 12, 2008

Opening File according to mime type

Most of the modern desktops already have a command line utility to open file according to their mime type (GNOME/gnome-open, OSX/open, Windows/start, XFCE/exo-open, KDE/kfmclient ...)

However, most (all?) of them rely on the file extension, and I needed something to view attachments from mutt, which passes the file data on stdin.

So, here we go (I call this attview):

#!/usr/bin/env python
'''View attachment with right application'''

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

from os import popen, system
from os.path import isfile
import re

class ViewError(Exception):
    pass

def view_attachment(data):
    # In the .desktop file, the file name is %u or %U
    u_sub = re.compile("%u", re.I).sub

    FILENAME = "/tmp/attview"
    fo = open(FILENAME, "wb")
    fo.write(data)
    fo.close()

    mime_type = popen("file -ib %s" % FILENAME).read().strip()
    if ";" in mime_type:
        mime_type = mime_type[:mime_type.find(";")]
    if mime_type == "application/x-not-regular-file":
        raise ViewError("can't guess mime type")

    # Find the .desktop file registered for this mime type
    APPS_DIR = "/usr/share/applications"
    for line in open("%s/defaults.list" % APPS_DIR):
        if line.startswith(mime_type):
            mime, appfile = line.strip().split("=")
            break
    else:
        raise ViewError("can't find how to open %s" % mime_type)

    appfile = "%s/%s" % (APPS_DIR, appfile)
    if not isfile(appfile):
        raise ViewError("can't find %s" % appfile)
    for line in open(appfile):
        line = line.strip()
        if line.startswith("Exec"):
            key, cmd = line.split("=")
            fullcmd = u_sub(FILENAME, cmd)
            if fullcmd == cmd:
                fullcmd += " %s" % FILENAME
            system(fullcmd + "&")
            break
    else:
        raise ViewError("can't find Exec in %s" % appfile)

def main(argv=None):
    from sys import stdin
    if argv is None:
        import sys
        argv = sys.argv

    from optparse import OptionParser

    parser = OptionParser("usage: %prog [FILENAME]")

    opts, args = parser.parse_args(argv[1:])
    if len(args) not in (0, 1):
        parser.error("wrong number of arguments") # Will exit

    filename = args[0] if args else "-"

    try:
        if filename == "-":
            data = stdin.read()
        else:
            data = open(filename, "rb").read()
    except IOError, e:
        raise SystemExit("error: %s" % e.strerror)

    try:
        view_attachment(data)
    except ViewError, e:
        raise SystemExit("error: %s" % e)

if __name__ == "__main__":
    main()

Thursday, February 07, 2008

Playing with bits

def mask(size):
    '''Mask for `size' bits

    >>> mask(3)
    7
    '''
    return (1 << size) - 1

def num2bits(num, width=32):
    '''String representation (in bits) of a number

    >>> num2bits(3, 5)
    '00011'
    '''
    s = ""
    for bit in range(width - 1, -1, -1):
        if num & (1 << bit):
            s += "1"
        else:
            s += "0"
    return s

def get_bit(value, bit):
    '''Get value of bit

    >>> num2bits(5, 5)
    '00101'
    >>> get_bit(5, 0)
    1
    >>> get_bit(5, 1)
    0
    '''
    return (value >> bit) & 1

def get_range(value, start, end):
    '''Get range of bits

    >>> num2bits(5, 5)
    '00101'
    >>> get_range(5, 0, 1)
    1
    >>> get_range(5, 1, 2)
    2
    '''
    return (value >> start) & mask(end - start + 1)

def set_bit(num, bit, value):
    '''Set bit `bit' in num to `value'

    >>> i = 5
    >>> set_bit(i, 1, 1)
    7
    >>> set_bit(i, 0, 0)
    4
    '''
    if value:
        return num | (1 << bit)
    else:
        return num & (~(1 << bit))

def sign_extend(num, size):
    '''Sign extend a number that is `size' bits wide

    >>> sign_extend(5, 2)
    1
    >>> sign_extend(5, 3)
    -3
    '''
    m = mask(size - 1)
    res = num & m
    # Positive
    if (num & (1 << (size - 1))) == 0:
        return res

    # Negative, 2's complement
    res = ~res
    res &= m
    res += 1
    return -res
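
For comparison, sign extension can also be written without branching, using the well-known XOR-and-subtract trick. This is a Python 3 sketch of an alternative, not a replacement for the version above:

```python
def sign_extend(num, size):
    """Interpret the low `size` bits of num as a two's complement number."""
    num &= (1 << size) - 1      # keep only the low `size` bits
    sign_bit = 1 << (size - 1)
    # Flipping the sign bit and subtracting it maps [sign_bit, 2**size)
    # onto the negative range and leaves smaller values unchanged
    return (num ^ sign_bit) - sign_bit

print(sign_extend(5, 3))     # 0b101 as 3 bits -> -3
print(sign_extend(0xFF, 8))  # -> -1
```

Worked by hand for 5 in 3 bits: 5 ^ 4 = 1, and 1 - 4 = -3, which matches the two's complement reading of 101.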

Wednesday, February 06, 2008

rotate and stretch

from operator import itemgetter, add
from itertools import imap, chain, repeat

def rotate(matrix):
    '''Rotate matrix 90 degrees'''
    def row(row_num):
        return map(itemgetter(row_num), matrix)

    return map(row, range(len(matrix[0])))

def stretch(items, times):
    '''stretch([1,2], 3) -> [1,1,1,2,2,2]'''
    return reduce(add, map(lambda item: [item] * times, items), [])

def istretch(items, count):
    '''istretch([1,2], 3) -> [1,1,1,2,2,2] (generator)'''
    return chain(*imap(lambda i: repeat(i, count), items))
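
A few sample inputs make the behaviour concrete. Note that rotate above turns rows into columns, which is a transpose; a clockwise quarter turn also needs the row order reversed. The Python 3 sketch below uses my own names (transpose, rotate_cw) and zip instead of itemgetter:

```python
from itertools import chain, repeat

def transpose(matrix):
    """Rows become columns."""
    return [list(row) for row in zip(*matrix)]

def rotate_cw(matrix):
    """Rotate 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(row) for row in zip(*matrix[::-1])]

def stretch(items, times):
    """stretch([1, 2], 3) -> [1, 1, 1, 2, 2, 2]"""
    return list(chain.from_iterable(repeat(item, times) for item in items))

m = [[1, 2],
     [3, 4]]
print(transpose(m))       # [[1, 3], [2, 4]]
print(rotate_cw(m))       # [[3, 1], [4, 2]]
print(stretch([1, 2], 3)) # [1, 1, 1, 2, 2, 2]
```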

Friday, February 01, 2008


Just found this on the web ...


#!/usr/bin/env python
# Find paths matching directories in subversion repository

__author__ = "Miki Tebeka <miki.tebeka@gmail.com>"

# TODO:
# * Limit search depth
# * Add option to case [in]sensitive
# * Handling of svn errors
# * Support more of "find" predicates (-type, -and, -mtime ...)
# * Another project: Pre index (using swish-e ...) and update only from
#   changelog

from os import popen

def join(path1, path2):
    if not path1.endswith("/"):
        path1 += "/"
    return "%s%s" % (path1, path2)

def svn_walk(root):
    command = "svn ls '%s'" % root
    for path in popen(command):
        path = join(root, path.strip())
        yield path
        if path.endswith("/"): # A directory
            for subpath in svn_walk(path):
                yield subpath

def main(argv=None):
    if argv is None:
        import sys
        argv = sys.argv

    import re
    from itertools import ifilter
    from optparse import OptionParser

    parser = OptionParser("usage: %prog PATH EXPR")

    opts, args = parser.parse_args(argv[1:])
    if len(args) != 2:
        parser.error("wrong number of arguments") # Will exit

    path, expr = args
    try:
        pred = re.compile(expr, re.I).search
    except re.error:
        raise SystemExit("error: bad search expression: %s" % expr)

    found = 0
    for path in ifilter(pred, svn_walk(path)):
        found = 1
        print path

    if not found:
        raise SystemExit("error: nothing matched %s" % expr)

if __name__ == "__main__":
    main()

Friday, January 18, 2008

Simple Text Summarizer

  • About 50 lines of code
  • Gives reasonable results (try it out)
  • tokenize needs to be improved much more (better detection, stop words ...)
  • split_to_sentences needs to be improved much more (handle 3.2, Mr. Smith ...)
  • In real life you'll need to "clean" the text (Ads, credits, ...)

Tuesday, January 15, 2008

attrgetter is fast

#!/usr/bin/env python

from operator import attrgetter
from random import shuffle

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

def sort1(points):
    points.sort(key = lambda p: p.x)

def sort2(points):
    points.sort(key = attrgetter("x"))

if __name__ == "__main__":
    from timeit import Timer

    points1 = [Point(x, 2 * x) for x in range(100)]
    points2 = points1[:]

    num_times = 10000

    t1 = Timer("sort1(points1)", "from __main__ import sort1, points1")
    print t1.timeit(num_times)

    t2 = Timer("sort2(points2)", "from __main__ import sort2, points2")
    print t2.timeit(num_times)

$ ./attr.py

Friday, January 04, 2008

Faster and Shorter "dot" using itertools

Let's calculate the dot product of two vectors:

from itertools import starmap, izip
from operator import mul

def dot1(v1, v2):
    result = 0
    for i, value in enumerate(v1):
        result += value * v2[i]
    return result

def dot2(v1, v2):
    return sum(starmap(mul, izip(v1, v2)))

if __name__ == "__main__":
    from timeit import Timer

    num_times = 1000
    v1 = range(100)
    v2 = range(100)

    t1 = Timer("dot1(%s, %s)" % (v1, v2), "from __main__ import dot1")
    print t1.timeit(num_times) # 0.038722038269

    t2 = Timer("dot2(%s, %s)" % (v1, v2), "from __main__ import dot2")
    print t2.timeit(num_times) # 0.0260770320892

dot2 is faster and shorter, however dot1 is more readable - my vote goes to dot2.
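
On Python 3, izip is gone and map accepts several iterables, so the short version gets shorter still. A sketch (dot3 is my name; no timing is claimed for it):

```python
from operator import mul

def dot3(v1, v2):
    """Dot product: multiply element-wise, then sum."""
    # map with two iterables pairs the elements up, like starmap over zip
    return sum(map(mul, v1, v2))

print(dot3([1, 2, 3], [4, 5, 6]))  # 32
```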
