Here's an example of generating dynamic images using matplotlib in a web server (flask this time).
If it won't be simple, it simply won't be. [Hire me, source code] by Miki Tebeka, CEO, 353Solutions
Saturday, April 20, 2013
Thursday, April 11, 2013
Tuesday, April 09, 2013
Quickly Plotting Labeled Data
Here's a quick way to view some labeled data you have (taken from An Introduction to scikit-learn). It will reduce the data to two dimensions using PCA and then scatter plot it with different colors for each label.
Script to close a branch in mercurial (hg)
Mercurial (hg) does not let you delete branches (or alter history in any way). But you can close branches so they won't show up in the output of the hg branches command.
Here's a script I use to close branches (we work with feature branches at work, and close them when work on the feature is done).
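The original script isn't reproduced here, so what follows is only a minimal sketch of how such a helper might look in Python. It assumes the usual flow of updating to the branch, committing with --close-branch, and going back to default. The function names, the commit message and the default return branch are my own placeholders.

```python
import subprocess


def close_branch_commands(branch, return_to="default"):
    """Return the hg commands (as argv lists) that close `branch`.

    The commit message and `return_to` default are illustrative assumptions.
    """
    return [
        ["hg", "update", branch],
        ["hg", "commit", "--close-branch", "-m", "Close branch {}".format(branch)],
        ["hg", "update", return_to],
    ]


def close_branch(branch, return_to="default"):
    """Run the commands in order, stopping at the first one that fails."""
    for cmd in close_branch_commands(branch, return_to):
        subprocess.check_call(cmd)
```

Calling close_branch("my-feature") inside a repository would run the three commands. Keeping the command list in its own function makes it easy to print or check before anything touches the repository.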
Sunday, March 31, 2013
gittip on bitbucket/github
gittip is a cool idea, however currently there's no built-in way to add it to bitbucket/github projects.
One option I found that works is to add a clickable image to your README.md or README.rst.
See example here.
Markdown:
[![gittip](http://i.imgur.com/lg9rx9w.png)](https://www.gittip.com/Miki%20Tebeka/)
reStructuredText:
.. image:: http://i.imgur.com/lg9rx9w.png
   :alt: gittip
   :target: https://www.gittip.com/Miki%20Tebeka/
Notes:
- You'll probably want to change gittip user id :)
- There's a discussion on gittip bug tracker on the right way to do this.
- Unofficial gittip image generated using cooltext.
Thursday, March 28, 2013
import "C" slides
Last night I gave a talk about using C from Go at the L.A. Gophers meetup.
You can view the slides here. (Note that "run" won't work due to security restrictions. You can download the slides here and run them locally using the present tool.)
Wednesday, March 13, 2013
Investigating Hash Distribution
A colleague asked me for a hash function on strings that returns an integer between 0 and N. Before diving in, I decided to take the lazy path and check if Python's hash function is good enough.
Luckily, ipython notebook --pylab=inline makes that a breeze.
Check out the notebook here.
And yes, we decided to stick with this solution. I guess we're at least 1/3 programmers.
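The notebook itself isn't included here, so this is just a rough sketch of the kind of check it describes: push a bunch of strings through hash, reduce modulo N, and count how many land in each bucket. The sample keys and the bucket count of 16 are made up for illustration. Note that on Python 3 string hashes are randomized per process unless PYTHONHASHSEED is set, so the exact counts change between runs.

```python
from collections import Counter


def bucket(text, n):
    """Map a string to an integer in [0, n) using the built-in hash."""
    return hash(text) % n  # % with a positive n is never negative in Python


def bucket_counts(strings, n):
    """Count how many strings land in each of the n buckets."""
    counts = Counter(bucket(s, n) for s in strings)
    return [counts.get(i, 0) for i in range(n)]


# Made-up sample keys; real data would come from your own strings.
words = ["key-{}".format(i) for i in range(10000)]
counts = bucket_counts(words, 16)
print("smallest bucket:", min(counts), "largest bucket:", max(counts))
```

If the smallest and largest buckets are close to each other, the distribution is probably good enough for the lazy path.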
Friday, March 08, 2013
zipstream - Zip File InputFormat for Hadoop Streaming
At work, we store logs as a single CSV inside a zip file in HDFS (history, that's why :).
Looking around, I couldn't find a FileInput library that works with Hadoop streaming on CDH4 (the version we're using).
So I wrote one. I hope you'll find it useful (you can download the jar directly from here).
Here's an example of how to use it:
Thursday, February 21, 2013
Abusing namedtuple - Yet Another Enum
There's a discussion over at python-ideas about enum. This prompted me to write yet another implementation of enum, this time abusing namedtuple.
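The implementation from the post isn't shown here, so the following is only one possible way to abuse namedtuple for this, not necessarily the original. The enum helper and the Color example are illustrative names. Each field name becomes an attribute whose value is its position.

```python
from collections import namedtuple


def enum(*names):
    """Build an enum-like object: each name maps to its position."""
    cls = namedtuple("Enum", names)
    return cls(*range(len(names)))


Color = enum("red", "green", "blue")
print(Color.green)                 # attribute access gives the value
print(Color._fields[Color.blue])   # reverse lookup from value to name
```

Since it is a tuple underneath, the values are immutable and you get iteration and len for free.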
Friday, February 15, 2013
try lock
At work, we have several functions that can run only one at a time. We call this "try lock" (or trylock), and have had it forever in the Java code.
When we started a Python project, we wanted this functionality. A decorator seems like the right solution. The try_lock decorator below takes an optional function, keyfn, that gives you finer-grained control over what to lock. It gets the function arguments and returns a key to lock on. If you don't specify keyfn, there will be just one lock for the function.
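As the original code isn't reproduced here, this is a minimal sketch of what a decorator along those lines could look like, assuming threads (not processes) and that a busy lock should raise an error. The AlreadyRunning exception is a name I made up, and what to do when the lock is taken is a design choice the text doesn't spell out.

```python
import threading
from functools import wraps


class AlreadyRunning(Exception):
    """Raised when the lock for a call is already held (hypothetical name)."""


def try_lock(keyfn=None):
    """Allow only one running call per key; fail fast instead of waiting."""
    def decorator(fn):
        locks = {}                      # key -> threading.Lock
        locks_guard = threading.Lock()  # protects the dictionary itself

        @wraps(fn)
        def wrapper(*args, **kwargs):
            # Without keyfn every call shares the single key None.
            key = keyfn(*args, **kwargs) if keyfn else None
            with locks_guard:
                lock = locks.setdefault(key, threading.Lock())
            if not lock.acquire(blocking=False):
                raise AlreadyRunning(
                    "{} is already running for key {!r}".format(fn.__name__, key))
            try:
                return fn(*args, **kwargs)
            finally:
                lock.release()
        return wrapper
    return decorator
```

Use it as @try_lock() for one lock per function, or @try_lock(keyfn=lambda user, *args: user) to lock per user. Note that this sketch never removes keys from the dictionary, so it fits a small, bounded set of keys.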
Thursday, January 24, 2013
whoops - A WebHDFS Library and Client
Just released whoops 0.1.0 which is a WebHDFS library and a command line client for Python.
Wednesday, December 19, 2012
Timing Your Code
It's a good idea to time portions of your code and have some metric you monitor. This way you can see trends and solve bottlenecks before someone notices (hopefully).
Timing functions is easy with decorators, but sometimes you want to time a portion of a function. For this you can use a context manager.
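Here is a small sketch of such a context manager, not necessarily the one from the original post. The timed name and the report callback are assumptions. In real code report would probably send the number to whatever metrics system you monitor.

```python
import time
from contextlib import contextmanager


def print_timing(name, elapsed):
    """Default reporter: just print. Swap in your metrics client here."""
    print("{}: {:.6f} seconds".format(name, elapsed))


@contextmanager
def timed(name, report=print_timing):
    """Time the body of a with block and hand the result to `report`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Report even if the block raised, so slow failures are visible too.
        report(name, time.perf_counter() - start)


with timed("sum"):
    total = sum(range(100000))
```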
Tuesday, December 11, 2012
fastavro got snappy support
fastavro 0.7.1 introduces snappy support! (also 3.3 egg)
Tuesday, November 20, 2012
Last Letter Frequency
I was playing a game with my child where you say a word, then the other person needs to say a word which starts with the last letter of the word you said, then you need to say a word with their last letter ...
We noticed that many words end with S and E, which made me curious about the frequency of the last letter in English words. matplotlib makes it super easy to visualize the results.
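The counting part is short enough to sketch. This is just the tally, without the matplotlib plot, and the five sample words are made up. A real run would feed it a proper word list.

```python
from collections import Counter


def last_letter_frequency(words):
    """Return {letter: fraction of words ending with it}, most common first."""
    letters = (w.strip().lower()[-1] for w in words if w.strip())
    counts = Counter(c for c in letters if c.isalpha())
    total = sum(counts.values())
    return {letter: count / total for letter, count in counts.most_common()}


# Tiny made-up sample; replace with a real word list.
sample = ["apple", "bananas", "grape", "cats", "dog"]
for letter, freq in last_letter_frequency(sample).items():
    print(letter, freq)
```

The resulting dictionary can go straight into a bar chart, with the letters as labels and the fractions as heights.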
Friday, November 16, 2012
Python For Data Analysis
Just finished reading Python For Data Analysis. It's a great book with lots of practical examples. Highly recommended.
Thursday, October 25, 2012
Mocking HTTP Servers
Sometimes, httpbin is not enough, and you need your own custom HTTP server for testing.
Here's a small example on how to do that using the built in SimpleHTTPServer (thanks @noahsussman for reminding me).
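The original example used Python 2's SimpleHTTPServer and isn't reproduced here. As a rough sketch of the same idea on Python 3, where that functionality lives in http.server, the code below starts a throwaway server in a background thread and hits it with a client. The handler name, the JSON reply and the /users/1 path are all made up for illustration.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockHandler(BaseHTTPRequestHandler):
    """Reply to every GET with a small JSON document echoing the path."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "ok": True}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the test output quiet


def start_mock_server():
    """Start the server on a free port in a daemon thread and return it."""
    server = HTTPServer(("127.0.0.1", 0), MockHandler)  # port 0: OS picks one
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    return server


server = start_mock_server()
conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request("GET", "/users/1")
reply = json.loads(conn.getresponse().read().decode("utf-8"))
conn.close()
server.shutdown()
server.server_close()
print(reply)
```

In a test suite, the start and shutdown calls would typically go in setup and teardown, with the client under test pointed at server.server_port.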
Monday, October 15, 2012
http://httpbin.org
Sometimes you need to write an HTTP server to debug the client you are writing.
One quick way to avoid this is to use http://httpbin.org/. It supports most of the common HTTP verbs and mostly returns the variables you send in.
For example (note the args field in the reply):
$ curl -i 'http://httpbin.org/get?x=1&y=2'
HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 15 Oct 2012 21:50:27 GMT
Server: gunicorn/0.13.4
Content-Length: 386
Connection: keep-alive

{
  "url": "http://httpbin.org/get?x=1&y=2",
  "headers": {
    "Content-Length": "",
    "Connection": "keep-alive",
    "Accept": "*/*",
    "User-Agent": "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3",
    "Host": "httpbin.org",
    "Content-Type": ""
  },
  "args": {
    "y": "2",
    "x": "1"
  },
  "origin": "75.82.8.111"
}
Friday, October 05, 2012
Cleanup After Your Tests - But Be Lazy
It's a nice practice to clean up after your tests. It's good for various reasons like disk space, "pure" execution environment and others.
However, if you clean up too eagerly it'll make your debugging much harder. The data just won't be there to see what went wrong.
The solution we found is pretty simple:
- Try to place all your test output in one location
- Nuke this location when starting the tests
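The two steps above can be sketched in a few lines. The directory name is a made-up placeholder, and where you call reset_test_output from (a setup hook, a test runner script) depends on your test framework.

```python
import os
import shutil
import tempfile

# Hypothetical single location for everything the tests write.
TEST_OUTPUT = os.path.join(tempfile.gettempdir(), "myproject-test-output")


def reset_test_output(path=TEST_OUTPUT):
    """Delete leftovers from the previous run and recreate an empty directory."""
    shutil.rmtree(path, ignore_errors=True)  # fine if it doesn't exist yet
    os.makedirs(path, exist_ok=True)
    return path
```

Because the cleanup happens at the start of a run and not at the end, whatever the last run wrote is still on disk when you come to debug a failure.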
Thursday, September 20, 2012
Data Wrangling With Python
I just gave a talk at work called "Data Wrangling With Python" which gives an overview on the scientific Python ecosystem.
You can view it here.
Friday, September 14, 2012
Using Hadoop Streaming With Avro
One of the ways to use Python with Hadoop is via Hadoop Streaming. However, it's geared mostly toward text-based formats and at work we use mostly Avro.
It took me a while to figure out the magic, but here it is. Note that the input to the mapper is one JSON object per line.
Note it's a bit old (Avro is now at 1.7.4), originally from here.
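The Hadoop invocation itself isn't reproduced here, but the mapper side is easy to sketch given that each input line is one JSON object. This is only an illustration. The "user" field and the count-per-key output are assumptions, so substitute whatever fields your Avro schema has.

```python
import json
import sys


def map_lines(lines, key_field="user"):
    """Yield tab-separated key/value lines, one per JSON record.

    `key_field` is a hypothetical field name; use one from your schema.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        yield "{}\t1".format(record[key_field])


def main():
    """Streaming entry point: read records from stdin, write to stdout."""
    for out in map_lines(sys.stdin):
        print(out)

# In the actual mapper script, finish with a call to main().
```

Keeping the logic in map_lines means you can test the mapper on a plain list of strings without going anywhere near a cluster.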



