At one of my clients, we wanted something quick and dirty to prevent deletes from Elasticsearch (Shield is too expensive and would take too much time to integrate with our systems; we'll fix this technical debt later).
The quick solution was to place HAProxy in front of Elasticsearch and use its ACL mechanism to block HTTP DELETE requests. Works like a charm.
Here's the HAProxy configuration and the docker-compose setup file I used to test the configuration.
If it won't be simple, it simply won't be. [Hire me, source code] by Miki Tebeka, CEO, 353Solutions
Thursday, December 31, 2015
Tuesday, December 22, 2015
Python's deque for Go
Working on a Go project with my friend Fabrizio, I investigated ways to get a faster data structure for storing history items with append and pop.
I got the idea to try implementing Python's deque in Go. The C implementation is pretty easy to read. The result is deque for Go, which implements only a subset of the features of Python's deque, but enough for our needs. And it's pretty fast too:
$ make compare
Git head is 765f6b0
cd compare && go test -run NONE -bench . -v
testing: warning: no tests to run
PASS
BenchmarkHistAppend-4 3000000 517 ns/op
BenchmarkHistList-4 2000000 702 ns/op
BenchmarkHistQueue-4 3000000 576 ns/op
BenchmarkHistDeque-4 3000000 423 ns/op
ok _/home/miki/Projects/go/src/github.com/tebeka/deque/compare 8.505s
Wednesday, November 11, 2015
aenumerate - enumerate for async for
Python's new async/await syntax helps a lot with writing async code. Here's a little utility that provides the async equivalent of enumerate.
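The utility itself is not shown in this text, so here is a minimal sketch of what an async enumerate could look like. It assumes Python 3.7 or later (async generators and asyncio.run), which is newer than what was available when this was written; the ticker and demo names are made up for the illustration.

```python
import asyncio


async def aenumerate(aiterable, start=0):
    """Async equivalent of enumerate: yield (index, item) pairs."""
    index = start
    async for item in aiterable:
        yield index, item
        index += 1


async def ticker(items):
    """Hypothetical async source, used only for this demo."""
    for item in items:
        await asyncio.sleep(0)  # give control back to the event loop
        yield item


async def demo():
    async for index, item in aenumerate(ticker('abc')):
        print(index, item)


asyncio.run(demo())
```

Like the built-in enumerate, the sketch accepts an optional start value.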
Thursday, September 24, 2015
git - Creating Pull Request for master
A co-worker asked me for a code review (we're using Stash, but this can work for other systems as well). The problem was that he had worked on master (he started his own project) and not in a development branch. The solution was to create an empty orphan branch and then a pull request from master to that branch (reversing the usual order).
Here's how to create such a branch.
Tuesday, September 01, 2015
Go Tour Exercise Solutions
As a backup plan for the last Go Meetup, I wrote the solutions to the exercises in Go Tour and we discussed some of them.
You can find the solutions here.
Monday, August 10, 2015
re2 available on conda
We're using re2 to get some speed gains on the many regular expressions we're trying to match. So far, building it was either a manual step or a script that ran when building a Docker container. I decided to create a conda package (we're using Miniconda as our Python environment).
I started with conda skeleton pypi re2 (you need to conda install conda-build first). Then, after some tweaking to build.sh, we were good to go.
The result - you can now conda install -c tebeka re2 (only 64-bit Linux is supported currently).
The project is here; I'll gladly accept any comments/improvements.
Here's build.sh, which patches the re2 Makefile and adds the library and header locations to the Python build step.
Tuesday, July 14, 2015
fastavro moved to github
If you can't beat them ... :)
fastavro is now on github. I still prefer mercurial as an SCM, but most of the pull requests I get are on github and it isn't worth the effort of maintaining two repositories (though hg-git is a big help).
Wednesday, July 08, 2015
dockermon - A Docker Event Monitor
I'm currently working with the awesome team at CyberInt (and yes, they are hiring).
We're moving to a Docker-based environment. The old environment used Supervisor to monitor and relaunch daemons. We had an event listener that notified us in our HipChat room every time a daemon crashed, and we wanted the same feature with our Docker containers.
We didn't find a ready solution, so we wrote one and made it open source. The project is called dockermon; it is a single Python script with no external dependencies that works on both Python 2 and 3.
Tuesday, June 30, 2015
Naming "with open" Variable
Python's "with" statement is great for resource handling. However, I find myself struggling with naming (and naming is important) the context manager variable.
When you write "with open('/path/to/somewhere') as X", what's the best name for X? In some cases it's obvious, but in most cases I find myself using the generic "fo" (stands for "file object").
I decided to run a little script on Python 3.4's Lib directory to find out which name is the most common. Here are the results:
Seems like f is the most common, but I really don't like single-letter variables. I'll go with second place - fp.
Here's the script used to generate this chart:
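The original script is not reproduced in this text. As a rough sketch of how such a count could be done (not necessarily how it was done), the code below parses every .py file under a directory with the ast module and counts the names bound by "with open(...) as NAME". The function names are made up for this illustration, and it only catches direct calls to the built-in open (not io.open or similar).

```python
import ast
from collections import Counter
from pathlib import Path


def open_names(source):
    """Yield names bound by `with open(...) as NAME` in Python source."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, (ast.With, ast.AsyncWith)):
            continue
        for item in node.items:
            call = item.context_expr
            if (isinstance(call, ast.Call)
                    and isinstance(call.func, ast.Name)
                    and call.func.id == 'open'
                    and isinstance(item.optional_vars, ast.Name)):
                yield item.optional_vars.id


def count_names(root):
    """Count `with open` variable names in all .py files under root."""
    counts = Counter()
    for path in Path(root).rglob('*.py'):
        try:
            counts.update(open_names(path.read_text(encoding='utf-8')))
        except (SyntaxError, UnicodeDecodeError, ValueError):
            continue  # skip files this interpreter cannot parse
    return counts
```

Calling count_names on a Lib directory and then most_common(10) on the result would give the data for a chart like the one described above.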
Friday, June 26, 2015
353Solutions - A Year in Review
353Solutions was founded a bit more than a year ago. I wasn't planning on doing consulting. I'm a techie and love the development abstraction layer that companies give you, which lets you code most of the time. (If this is not the case in the company you're working at - consider finding a better one :)
However, as the old saying goes - "Man plans and God laughs". I found myself owning a teaching/consulting company called 353Solutions. So far it's fun and provides for the family - what else can you ask for?
Here are the results of a short retrospective we did lately.
The Numbers
- 6 clients
- 204 work days
- 204 hours teaching Python (7 courses)
Thoughts
I like working from home, however most companies I talked to wanted some office time. This is understandable since I don't only code but also do system and process design - these roles require more face to face communication. I'm still looking for something that will allow me to spend most of my time working from home.
Teaching is fun! I did that on and off for most of my professional career, but now it's a big chunk of my time. I'm grateful to Raymond Hettinger, who started me off and showed me what a top-notch class/workshop should look like. So far I'm mostly going to companies and teaching there, but just now we launched our own classes - it will be awesome!
The downside of teaching is that it takes me away from home. In a limited amount this is great (I spend about a week every month teaching Python in the UK). However, I'm looking for opportunities that will let me teach from home - stay tuned.
The social network is by far my biggest source of new jobs. From talking to other people - it's not just me. Investing time in making connections and keeping them will pay off. The main downside is that people want to hire me and not 353Solutions. This means I need to work harder to market the other people I work with - I can't do everything.
Learning to say "no" was the hardest thing for me. So many interesting things to do, so many cool companies ... But I like spending time with my family, friends and hobbies. You need to find the things that make you happy and pay enough; going cheap is not a good thing in most cases. What I did in some cases was to take less money and get equity instead. Something like "technical" investing in startups.
The main point I need to improve is marketing. It's not something I like to do, but I feel the need, especially now that we have our own classes. I'm learning and looking for whatever will get maximal impact with a minimal amount of time. Or maybe hire someone for that? If you know a good option - please let me know :)
Thursday, June 04, 2015
Use contextlib.closing to Handle "Legacy" Resources
Python's context managers (with statement) are very handy for handling resources. (You see far fewer finally blocks in Python code thanks to them.)
Many objects in Python can be used as context managers - files, locks, database drivers and more. But some objects still cannot.
To handle these "legacy" objects you can use the contextlib.closing function, which returns a context manager that calls obj.close() when the with block exits.
Here's an example of using contextlib.closing with sockets. We'll be doing a simple HTTP request. (Yeah, you should probably use requests or urlopen - this is just an example :)
Note also the use of iter with a sentinel to read chunks of up to 1K from the socket.
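The example code is missing from this text, so below is a minimal sketch along the lines described above. The function names, the HTTP/1.0 request format and the default port are assumptions made for the illustration. In Python 3, sockets support the with statement directly, so closing is only strictly needed for them in Python 2.

```python
import socket
from contextlib import closing


def read_all(sock, chunk_size=1024):
    """Read from sock until the other side closes the connection."""
    # iter(callable, sentinel) keeps calling recv until it returns b''
    return b''.join(iter(lambda: sock.recv(chunk_size), b''))


def http_get(host, path='/', port=80):
    """Send a bare-bones HTTP/1.0 GET and return the raw reply bytes."""
    request = 'GET {} HTTP/1.0\r\nHost: {}\r\n\r\n'.format(path, host)
    # closing() guarantees sock.close() is called, even on errors
    with closing(socket.create_connection((host, port))) as sock:
        sock.sendall(request.encode('ascii'))
        return read_all(sock)
```

HTTP/1.0 is used here because the server closes the connection after replying, which is what makes the empty-bytes sentinel work.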
Wednesday, May 06, 2015
Combining jQuery and Multi Components of React
React is a great library for generating reactive web UI (and mobile). React works well if there are isolated components or one big component with a hierarchy. However, I wanted to have a page with several isolated React components that are updated from the same data. The solution I found is to use an observer pattern and have each component register a callback to handle data changes.
See the code below and a live demo here.
Note that I am a React newbie; if you know of a better way - please enlighten me.
Tuesday, April 21, 2015
Solving Project Euler Problem 8 with numpy
I'm teaching a course in scientific Python these days. Usually I give "homework" from Project Euler (which I personally use every time I learn a new programming language). I thought it'd be fun to solve the problem not just with Python but with what numpy has to offer as well.
Here's an example solution for Project Euler problem 8.
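The solution code is not included in this text. Problem 8 asks for the greatest product of 13 adjacent digits in a 1000-digit number; the sketch below is one possible numpy approach (not necessarily the one from the course). It builds every window of adjacent digits with index arithmetic and lets numpy multiply along each row. The function name is made up, and the long input number is left out on purpose.

```python
import numpy as np


def largest_product(digits, size):
    """Return the largest product of `size` adjacent digits in a digit string."""
    values = np.array([int(ch) for ch in digits], dtype=np.int64)
    # One row of indices per window: row i is [i, i + 1, ..., i + size - 1]
    starts = np.arange(len(values) - size + 1)[:, np.newaxis]
    windows = values[starts + np.arange(size)]
    # Multiply along each row, then pick the biggest product
    return int(windows.prod(axis=1).max())
```

With 13 digits the largest possible product is 9 to the power of 13, which still fits comfortably in a 64-bit integer.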
Tuesday, April 07, 2015
Docker + MiniConda = A Perfect Match
Working with one of my clients (who is hiring BTW), we decided to use Docker as our deployment platform. Since many Linux systems now use Python for many utilities, it's advisable to install your own Python next to the system one and use it.
Installing CPython from source requires some system packages, libraries, headers and some knowledge. The much easier path is to use MiniConda (from the wonderful people at Continuum). Not only is the Python installation super simple, but the conda package manager will also get you a lot of packages pre-compiled, so you don't have to install gcc and header files for C extensions. And if you can't find the package you need with conda, pip is also available.
Here's a little project to demonstrate how to do this. The application is an image server which has two entry points: /edge for edge detection and /resize for image resizing. We'll be using scikit-image and Pillow for image manipulation and Flask as the web server. All of them can be conda installed.
Here's the Dockerfile for the project. Build with docker build -t imgsrv . and run with docker run -p 8080:8080 imgsrv (see Makefile).
Tuesday, February 24, 2015
Adding Server SSL Certificate on Linux
Here's a small script to add a server SSL certificate on Linux. You can export the certificate from your browser. Inspired by this Stack Overflow post.
Tuesday, February 03, 2015
Logging from Celery to logstash and a structured log (JSON)
I love Celery, and we're using it at one of my customers. One thing we wanted to have is centralized logging, since you can have multiple workers on multiple machines. We looked at several solutions and the winner turned out to be logstash + kibana (AKA ELK stack).
Here's some code to log to logstash with Celery current task information (if available) and also to a structured log (every line is a JSON object) for backup in case of network issues.
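The code itself is not included in this text. The sketch below covers only the structured-log half of the idea: a logging formatter that writes one JSON object per line and adds the current Celery task id and name when there is one. Shipping records to logstash is left out, and the class name, function name and field names are assumptions made for this illustration.

```python
import json
import logging

try:
    # Celery is optional here so the sketch also runs outside a worker
    from celery import current_task
except ImportError:
    current_task = None


class JSONFormatter(logging.Formatter):
    """Format every log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            'time': self.formatTime(record),
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
        }
        # Add task details only when called from inside a running task
        request = getattr(current_task, 'request', None) if current_task else None
        if request is not None and request.id:
            entry['task_id'] = request.id
            entry['task_name'] = current_task.name
        return json.dumps(entry)


def json_file_handler(path):
    """Return a file handler that writes one JSON object per line."""
    handler = logging.FileHandler(path)
    handler.setFormatter(JSONFormatter())
    return handler
```

Adding the handler to the root logger would give every worker a local JSON log that can be replayed if the network path to logstash was down.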
Tuesday, January 27, 2015
Using supervisord to Manage Your Daemons
Say you have some daemons running. You'd like to restart them automatically if they fail, grab logs from them and in general manage them - supervisord will do it all for you.
One of my (super cool) clients also needed to start/stop daemons when the configuration changes. The solution was to have a script that updates supervisord.conf every time we have a configuration change and then selectively starts/stops only the daemons that have changed (by default, if you run supervisorctl update, it will restart all the daemons).
For this example, I'll assume that the daemon processes are python -m SimpleHTTPServer and that I have a list of ports I'd like to listen on. This list of ports might change.
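The update script is not included in this text. As a rough sketch of its two core pieces, under the assumption that each port gets its own [program:httpd-PORT] section, the code below renders the configuration sections and works out which ports were added or removed. The function names are made up, and writing the file and calling supervisorctl are left out.

```python
def program_section(port):
    """Return a supervisord [program:...] section for one HTTP server."""
    return (
        '[program:httpd-{port}]\n'
        'command=python -m SimpleHTTPServer {port}\n'
    ).format(port=port)


def render_config(ports):
    """Return the program sections for all ports, one per port."""
    return '\n'.join(program_section(port) for port in sorted(ports))


def diff_ports(old, new):
    """Return (added, removed) sets of ports between two port lists."""
    old, new = set(old), set(new)
    return new - old, old - new
```

After rewriting the configuration, the added and removed names could be handled one by one with supervisorctl reread followed by supervisorctl add and supervisorctl remove, which would leave the unchanged daemons alone.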
Example Usage
$ ./updated.py 8000 8001 8002
$ supervisorctl status
httpd-8000 RUNNING pid 31768, uptime 0:00:04
httpd-8001 RUNNING pid 31767, uptime 0:00:04
httpd-8002 RUNNING pid 31766, uptime 0:00:04
$ ./updated.py 8000 8004 8002 # Remove 8001, add 8004
httpd-8001: disappeared
httpd-8004: available
httpd-8001: stopped
httpd-8001: removed process group
httpd-8004: added process group
$ supervisorctl status
httpd-8000 RUNNING pid 31768, uptime 0:00:12
httpd-8002 RUNNING pid 31766, uptime 0:00:12
httpd-8004 RUNNING pid 31785, uptime 0:00:02
$
Friday, January 09, 2015
python -m
python -m lets you run modules as scripts. If your module is just one .py file it'll be executed (which usually means code under if __name__ == '__main__'). If your module is a directory, Python will look for __main__.py (next to __init__.py) and will run it.
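To make the one-file case concrete, here is a tiny hypothetical module (call it hello.py; the name and function are made up for this example). Importing it only defines greet, while running it with python -m hello also executes the block under the __main__ check.

```python
"""hello.py - a hypothetical one-file module."""
import sys


def greet(name):
    """Return a greeting for name."""
    return 'Hello, {}!'.format(name)


if __name__ == '__main__':
    # Runs for `python -m hello` and `python hello.py`, not on import
    name = sys.argv[1] if len(sys.argv) > 1 else 'world'
    print(greet(name))
```

The file has to be in the current directory or somewhere else on sys.path for python -m hello to find it.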
One of Python's mottoes is "batteries included", and this goes for python -m as well. Here are some (all?) of the gems hidden in the standard library. Sadly not all of them have help, but I poked around in the source code to see the usage.
json.tool
This is by far the one I use most. It nicely indents JSON input and writes it to standard output, which is very helpful in combination with curl.
$ curl -sL http://j.mp/1IuxaLD
[{"x":1,"y":2},{"x":3,"y":4},{"x":5,"y":6}]
$ curl -sL http://j.mp/1IuxaLD | python -m json.tool
[
    {
        "x": 1,
        "y": 2
    },
    {
        "x": 3,
        "y": 4
    },
    {
        "x": 5,
        "y": 6
    }
]
zipfile
zipfile will let you view, extract and create zip files - very much like the zip and unzip commands. Here's the help:
$ python -m zipfile -h
Usage:
zipfile.py -l zipfile.zip # Show listing of a zipfile
zipfile.py -t zipfile.zip # Test if a zipfile is valid
zipfile.py -e zipfile.zip target # Extract zipfile into target dir
zipfile.py -c zipfile.zip src ... # Create zipfile from sources
gzip
Like zipfile, gzip lets you compress and decompress .gz files, like gzip/gunzip. By default it'll compress a file, but with -d it will decompress.
python -m gzip wordlist.txt # Will create wordlist.txt.gz
python -m gzip -d wordlist.txt.gz # Will extract to wordlist.txt
filecmp
Compare two directories.
$ python -m filecmp /tmp/a /tmp/b
diff /tmp/a /tmp/b
Only in /tmp/a : ['1']
Only in /tmp/b : ['2']
Identical files : ['4']
Differing files : ['3']
Encode/Decode
Several modules let you encode/decode in various formats:
- base64
- uu
- encodings.rot_13
- binhex
- mimify
- quopri
For example
$ echo 'secertpassword' | python -m encodings.rot_13
frpregcnffjbeq
Servers
There are several servers that you can run; the ones I know are SimpleHTTPServer, CGIHTTPServer and smtpd (mail). If you quickly want to serve some files from a directory on your machine, just run:
python -m SimpleHTTPServer
Clients
Modules that provide simple clients to various protocols are:
- ftplib
- poplib
- nntplib
- smtplib (on localhost only)
- telnetlib
For example if you want to view Star Wars in text mode, do
$ python -m telnetlib towel.blinkenlights.nl
System Info
You can use platform to get some platform information (very much like uname -a) and locale to get locale information. Use mimetypes to get the mime type of a file:
$ python -m mimetypes doc.html
type: text/html encoding: None
Python Utilities
- compileall will compile all Python files to .pyc
- dis will show bytecode for a file
- pdb will start the Python debugger on a given file (see here)
- pydoc will show documentation on a module/class/function
- site will print some site information (sys.path, USER_BASE ...)
- sysconfig will show a lot of system-related information (such as exec_prefix)
- tabnanny will tell you if you mix tabs and spaces (like starting python with -t or -tt)
- tokenize will print a list of the tokens in a Python file
I mostly use pdb and pydoc, for example:
$ python -m pydoc os.remove
Help on built-in function remove in os:
os.remove = remove(...)
remove(path)
Remove a file (same as unlink(path)).
Profiling
There are several profilers and timers you can use from the command line:
- cProfile - Show profile information
- profile (use cProfile :)
- timeit - Time how long things run
- pstats - Print output of profiles
- trace - Show tracing information on run
Example:
$ python -m cProfile match.py
28537 function calls (27503 primitive calls) in 0.057 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 <string>:1(ArgInfo)
1 0.000 0.000 0.000 0.000 <string>:1(ArgSpec)
...
$ python -m timeit 'import math; math.factorial(100)'
100000 loops, best of 3: 12.9 usec per loop
timeit has good help from the command line.
IDLE
You can start IDLE by running python -m idlelib.idle
ensurepip
Python 2.7.9+ and 3.4+ come with an easy way to install pip. Run python -m ensurepip and PyPI is at your service.
That's about it ... What are your favorite python -m tools? Which ones did I miss?
EDIT: The good folks at comp.lang.python reminded me of a few I forgot:
unittest
python -m unittest discover will run unittest in discovery mode. Just drop in a new Python file whose name starts with test and it'll be picked up next time you run the tests. You can also run a specific test with python -m unittest test_file.TestClass.test_method.
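As a small illustration, a file like the following (saved under a hypothetical name such as test_add.py) would be picked up by discovery. The add function and the test class are made up for this example.

```python
import unittest


def add(a, b):
    """Toy function under test."""
    return a + b


class AddTest(unittest.TestCase):
    def test_ints(self):
        self.assertEqual(add(1, 2), 3)

    def test_strings(self):
        self.assertEqual(add('a', 'b'), 'ab')
```

With that file name, a single test could be selected with python -m unittest test_add.AddTest.test_ints.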
calendar
python -m calendar will show a calendar of the current year. You can also run python -m calendar YEAR to display a specific year and python -m calendar YEAR MONTH to display a specific month.
Easter Eggs
python -m this will display the Zen of Python.
python -m antigravity will open an XKCD comic web page (which my company is named after).