If it won't be simple, it simply won't be. [Hire me, source code] by Miki Tebeka, CEO, 353Solutions

Monday, December 26, 2016

Automatically Running BigQuery Flows

On a current project we're using Google's BigQuery to crunch some petabyte-scale data. We have several SQL scripts that we need to run in a specific order. The script below detects the table dependencies and runs the SQL scripts in order. As a bonus, you can run it with --view and it'll show you the dependency graph.
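
The dependency detection boils down to a topological sort. Here's a minimal sketch of that idea, not the original script: it assumes each script is keyed by the table it creates and that dependencies show up after FROM or JOIN. The names find_deps and run_order are made up for illustration, and nothing here talks to BigQuery.

```python
import re

# Assumption: tables are referenced as "FROM <name>" or "JOIN <name>"
TABLE_RE = re.compile(r'\b(?:FROM|JOIN)\s+`?([\w.]+)`?', re.IGNORECASE)


def find_deps(sql):
    """Return the set of table names a SQL script reads from."""
    return set(TABLE_RE.findall(sql))


def run_order(scripts):
    """Order tables so every table comes after the tables it reads.

    scripts maps the table a script creates to the script's SQL text.
    Tables that no script creates are treated as already existing.
    """
    deps = {
        table: {d for d in find_deps(sql) if d in scripts and d != table}
        for table, sql in scripts.items()
    }
    order = []
    while deps:
        # Tables whose dependencies were all scheduled can run now
        ready = sorted(t for t, d in deps.items() if not d)
        if not ready:
            raise ValueError('circular dependency: %s' % sorted(deps))
        order.extend(ready)
        for table in ready:
            del deps[table]
        for d in deps.values():
            d.difference_update(ready)
    return order


# Hypothetical scripts, keyed by the table each one creates
scripts = {
    'report': 'SELECT * FROM daily JOIN users ON daily.uid = users.id',
    'daily': 'SELECT uid, COUNT(*) AS n FROM events GROUP BY uid',
    'users': 'SELECT id, name FROM raw_users',
}
print(run_order(scripts))  # ['daily', 'users', 'report']
```

A real version would then feed each script to BigQuery in that order.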

Friday, November 25, 2016

Using built-in slice for indexing

At one place I consult I saw something like the following code:
This is good and valid Python code, however we can use the built-in slice to do the same job.

Also when you're writing your own __getitem__ consider that key might be a slice object.
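
Here's a small sketch of both points. The record layout and the Countdown class are invented for illustration; they are not the code from the consulting job.

```python
# Hypothetical fixed-width record: columns 0-9 hold the name, 10-12 the age
NAME = slice(0, 10)
AGE = slice(10, 13)

record = 'Bugs Bunny 78'
print(record[NAME])      # Bugs Bunny
print(int(record[AGE]))  # 78


class Countdown:
    """Sequence n, n-1, ..., 1 that accepts both ints and slices."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        if isinstance(key, slice):
            # slice.indices clips start/stop/step to the sequence length
            return [self[i] for i in range(*key.indices(len(self)))]
        if key < 0:
            key += len(self)
        if not 0 <= key < len(self):
            raise IndexError(key)
        return self.n - key


cd = Countdown(5)
print(cd[0], cd[-1])  # 5 1
print(cd[1:3])        # [4, 3]
```

Naming the slices keeps the magic numbers in one place instead of scattering record[0:10] around the code.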

Saturday, November 12, 2016

Friday, September 16, 2016

Simple Object Pools

Sometimes we need object pools to limit the number of resources consumed. The most common example is database connections.

In Go we sometimes use a buffered channel as a simple object pool.

In Python, we can do something similar with a Queue. Python's context manager makes the resource handling automatic so clients don't need to remember to return the object.
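
A minimal sketch of the Python side, assuming Python 3 and three integer "resources". This is a reconstruction of the idea, not the original pool.py, and the Pool class and its resource method are names picked for this example.

```python
from contextlib import contextmanager
from queue import Queue
from threading import Thread


class Pool:
    """Hand out a fixed set of resources, one client at a time each."""

    def __init__(self, resources):
        self._queue = Queue()
        for resource in resources:
            self._queue.put(resource)

    @contextmanager
    def resource(self):
        resource = self._queue.get()  # blocks until a resource is free
        try:
            yield resource
        finally:
            self._queue.put(resource)  # returned even if the client raised


def worker(n, pool, log):
    with pool.resource() as resource:
        log.append('worker %d got resource %d' % (n, resource))


pool = Pool([1, 2, 3])
log = []
threads = [Thread(target=worker, args=(i, pool, log)) for i in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print('\n'.join(log))
```

The order of the lines changes from run to run, since it depends on how the threads are scheduled.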


Here's the output of both programs:


$ go run pool.go
worker 7 got resource 0
worker 0 got resource 2
worker 3 got resource 1
worker 8 got resource 2
worker 1 got resource 0
worker 9 got resource 1
worker 5 got resource 1
worker 4 got resource 0
worker 2 got resource 2
worker 6 got resource 1

$ python pool.py
worker 5 got resource 1
worker 8 got resource 2
worker 1 got resource 3
worker 4 got resource 1
worker 0 got resource 2
worker 7 got resource 3
worker 6 got resource 1
worker 3 got resource 2
worker 9 got resource 3
worker 2 got resource 1

Tuesday, August 30, 2016

"Manual" Breakpoints in Go

When debugging, sometimes you need to set conditional breakpoints. This option is available both in gdb and delve. However, sometimes when the condition is complicated, it's hard or even impossible to set it. A way around this is to temporarily write the condition in Go and set the breakpoint "manually".

In Python we do it with pdb.set_trace(), in Go we'll need to work a little harder. The main idea is that a breakpoint is a special signal called SIGTRAP.

Here's the code to do this:
You'll need to tell the go tool not to optimize and to keep variable information:

$ go build -gcflags "-N -l" manual-bp

Then run a gdb session

$ gdb manual-bp 
(gdb) run 

When you hit the breakpoint, you'll be in assembly code. Exit two functions to get to your code

(gdb) fin
(gdb) fin

Then you'll be in your code and can run gdb commands

(gdb) p i
$1 = 3

This scheme also works with delve

$ dlv debug manual-bp.go 
(dlv) c 

Sadly delve doesn't have a "fin" command so you'll need to hit "n" (next) until you reach your code.

That's it, happy debugging.

Oh - and in the very old days we did about the same trick in C code. There we manually inserted asm("int $3") into the code. You can do this with cgo but sending a signal seems easier.
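
For comparison, here's a sketch of the Python side of the same trick. It's an illustration only: the process function and its condition are made up, and it uses the breakpoint() built-in (Python 3.7+), which calls pdb.set_trace() by default.

```python
def process(items):
    """Sum items, stopping in the debugger on a hard-to-express condition."""
    total = 0
    for i, item in enumerate(items):
        # The condition is plain Python instead of debugger syntax
        if i > 2 and item % 7 == 0 and total > 10:
            breakpoint()  # drops into pdb unless sys.breakpointhook is replaced
        total += item
    return total
```

Calling process([5, 6, 1, 14, 2]) stops in the debugger when i is 3. As in the Go version, remember to delete the breakpoint once you're done.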

Wednesday, August 24, 2016

Generate Relation Diagram from GAE ndb Model

Working with GAE, we wanted to create a relation diagram from our ndb model. By deferring the rendering to dot and using Python's reflection this became an easy task. Some links are still missing since we're using ancestor queries, but this can be handled by some class docstring syntax or just by manually editing the resulting dot file.
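
Here's a sketch of the approach that runs without the AppEngine SDK. KeyProperty below is a stand-in class written for this example, not ndb.KeyProperty, and the three model classes are hypothetical. The real script would inspect actual ndb models instead.

```python
class KeyProperty:
    """Stand-in for an ndb key property: remembers the kind it points to."""

    def __init__(self, kind):
        self.kind = kind


# Hypothetical models
class Author:
    pass


class Book:
    author = KeyProperty(Author)


class Review:
    book = KeyProperty(Book)
    reviewer = KeyProperty(Author)


def to_dot(models):
    """Emit dot source with one edge per key property found by reflection."""
    lines = ['digraph model {']
    for model in models:
        lines.append('    %s;' % model.__name__)
        for name, prop in sorted(vars(model).items()):
            if isinstance(prop, KeyProperty):
                lines.append('    %s -> %s [label="%s"];' % (
                    model.__name__, prop.kind.__name__, name))
    lines.append('}')
    return '\n'.join(lines)


print(to_dot([Author, Book, Review]))
```

Save the output to a file and render it with something like "dot -Tpng model.dot -o model.png".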

Tuesday, July 05, 2016

Friday, June 10, 2016

Work with AppEngine SDK in the REPL

Working again with AppEngine for Python. Here's a small code snippet that will let you work with your code in the REPL (much better than the previous solution).
What I do in IPython is:
In [1]: %run initgae.py

In [2]: %run app.py

And then I can work with my code and test things out.

Monday, May 30, 2016

Using ImageMagick to Generate Images

One of the exercises we did this week in the Python workshop used the term "bounding box diagonal". I had a hard time explaining it to the students without an image. Google image search didn't find anything great, so I decided to create such an image.

First I tried drawing programs, but couldn't make the rectangle a square and the circle non-oval. Then I remembered ImageMagick. I have it installed and mostly use it to resize images, but it can do much more. A quick look at the examples and some trial and error, and here's the result.


And here's the script that generated it:

Saturday, April 16, 2016

Waiting for HTTP Server - Go Testing

I don't like mocking in tests. If I have a server to test, I prefer to start an instance and hit the API in my tests. Waiting for the server to start with a simple sleep is unpredictable. I prefer to start the server, then try to hit a URL until it's OK or fail after a long timeout. This is a simple task with Go's select and time.After.
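
The post's version uses Go's select and time.After. The same polling idea can be sketched in Python with a deadline loop. This is a rendering of the technique rather than the original code, and wait_for_server is a name picked for this example.

```python
import time
from urllib.request import urlopen


def wait_for_server(url, timeout=10.0, interval=0.1):
    """Poll url until it answers with 200 or about timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urlopen(url, timeout=1) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # refused, timed out or HTTP error: server isn't ready yet
        time.sleep(interval)
    return False
```

In a test's setup you'd start the server, then fail fast if wait_for_server returns False instead of guessing how long to sleep.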

Tuesday, March 29, 2016

Slap a --help on it

Sometimes we write "one off" scripts to deal with a certain task. However, more often than not these scripts live longer than just the one time. This is very common in ops related code, which for some reason people don't apply the regular coding standards to.

It really upsets me when I try to see what a script is doing, run it with the --help flag and it happily deletes the database while I wait :) It's so easy to add help support in the command line. In Python we do it with argparse, and we roll our own in bash. In both cases it's an extra 3 lines of code.

Please be kind to your future self and add --help support to your scripts.
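
For the Python case, a minimal sketch looks like this. The delete_stale_rows function is a hypothetical placeholder for whatever destructive work the script really does.

```python
from argparse import ArgumentParser


def delete_stale_rows():
    """Placeholder for the real (destructive) work of the script."""
    return 'deleted'


def main(argv=None):
    parser = ArgumentParser(description='Delete stale rows from the database')
    parser.parse_args(argv)  # on --help: prints usage and exits right here
    return delete_stale_rows()
```

Since parse_args runs first, "script.py --help" prints the description and exits before anything gets deleted.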

Friday, March 11, 2016

vfetch - Fetch Go Vendor Dependencies

Go 1.6 now supports vendoring. I found myself cloning dependencies to the "vendor" directory, then cloning their dependencies ... This got old really fast, so vfetch was born. It's a quick and dirty solution: it uses "go get" with a temporary GOPATH to get the package and its dependencies, then uses rsync to copy them to the vendor directory.

Installing is the usual "go get github.com/tebeka/vfetch" then you can use "vfetch github.com/gorilla/mux".

Comments, ideas and pull requests are more than welcome.

Tuesday, March 08, 2016

Super Simple nvim UI

I've been playing with neovim lately, enjoying it and a leaner RC file. nvim currently comes only in terminal mode and I wanted a way to spin up a new window for it. Here's a super simple script (I call it e) to start a new xfce4-terminal window with nvim.

Tuesday, February 23, 2016

Removing String Columns from a DataFrame

Sometimes you want to work just with the numerical columns in a pandas DataFrame. The rule of thumb is that every column with a dtype of object is not numeric (you can get fancier with numpy.issubdtype). We're going to use the DataFrame dtypes with some boolean indexing to accomplish this.

In [1]: import pandas as pd  

In [2]: df = pd.DataFrame([
   ...:     [1, 2, 'a', 3],
   ...:     [4, 5, 'b', 6],
   ...:     [7, 8, 'c', 9],
   ...: ])  

In [3]: df  
Out[3]: 
   0  1  2  3
0  1  2  a  3
1  4  5  b  6
2  7  8  c  9

In [4]: df.dtypes  
Out[4]: 
0     int64
1     int64
2    object
3     int64
dtype: object

In [5]: df[df.columns[df.dtypes != object]]
Out[5]: 
   0  1  3
0  1  2  3
1  4  5  6
2  7  8  9


Saturday, January 23, 2016

Forging Python - First Chapter is Up

Finally, the first chapter of my upcoming book "Forging Python" is up. I'm doing it leanpub style so comments and suggestions are more than welcome.

I plan to finish the book this year, hopefully during the summer. However more than one person said I'm way too optimistic - time will tell :)

Tuesday, January 05, 2016

353Solutions - 2015 in Review

Happy new year!

This was the first full calendar year that 353Solutions has been operating. Let's start with the numbers and then some insights and future goals.

Numbers

  • 170 days of work in total
    • Work day is a day where I billed someone for some part of it
      • Can be an hour, can be 24 hours (when teaching abroad)
    • There were a total of 251 work days in 2015
    • There were some work days that are not billable (drafting syllabuses, answering emails ...) but not that many
  • 111 days consulting for 4 clients
    • 1st Go project!!!
  • 58 days teaching 14 courses
    • Python at all levels and scientific Python (including new async workshop)
    • In UK, Poland and Israel

Insights

  • Social network provided almost all the work
    • Keep investing in good friends (not just for work :)
  • Workshops pay way more than consulting
    • However can't work from home in workshops
    • Consulting keeps you updated with latest tech
  • Had to let go of a client due to draconian contract
    • No regrets here, it was the right decision
    • Super nice team. Sadly, lawyers had the final say at the company
  • Python and data science are big and in high demand
  • Delegating overhead to the right person helps a lot
    • Accounting, contracts ...

Future Goals

  • Keep positioning in Python and Scientific Python area
  • Drive more Go projects and workshops
  • Work fewer days, have the same revenue at end of year
  • Start some "public classes" where we rent a class and people show up
    • Some companies don't have big enough data science team
    • Need to invest in advertising
  • Publish my book (more on that later)
