Kamil Choudhury

#define ZERO -1 // oh no it's technology all the way down

Python Profiling for Babies

If you have ever written a big stack of Python code and wondered why it is stupidly slow, you need a profiler. My go-to choice is pyinstrument, which samples your program while it is running and prints out pretty trees of where your program is spending time.
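If "samples your program" sounds like magic, here is a rough sketch of the idea using only the standard library. This is not how pyinstrument is implemented; it is a toy that peeks at a thread every millisecond and counts which function it finds there. The names `sample_thread`, `busy_loop` and `profile_busy_loop` are made up for illustration.

```python
import collections
import sys
import threading
import time


def sample_thread(counts, stop_event, thread_id, interval=0.001):
    """Every `interval` seconds, record which function the target thread is in."""
    while not stop_event.is_set():
        # sys._current_frames() maps thread ids to their current stack frame.
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval)


def busy_loop(seconds):
    """Hypothetical hot spot: spin until `seconds` have elapsed."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def profile_busy_loop(seconds=0.2):
    """Run busy_loop while a background thread samples the calling thread."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_thread,
        args=(counts, stop_event, threading.get_ident()),
    )
    sampler.start()
    try:
        busy_loop(seconds)
    finally:
        stop_event.set()
        sampler.join()
    return counts


if __name__ == "__main__":
    for name, hits in profile_busy_loop().most_common():
        print(f"{hits:5d} samples  {name}")
```

Functions that hog the CPU get caught in the act more often, so their sample counts are higher. A real sampling profiler records the whole call stack at each sample, which is how it can draw a tree instead of a flat list.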

Installing it is pretty easy:

pip-3.6 install pyinstrument --user

And then you pop into your Python code and wrap the part you want to profile:

def big_terrible_piece_of_code():
    from pyinstrument import Profiler

    # Start sampling
    profiler = Profiler()
    profiler.start()

    # Do a whole lot of stuff

    # Stop sampling and print the call tree
    profiler.stop()
    print(profiler.output_text(unicode=True, color=True))

That'll output a big old graph like this:

2.449 None  None
└─ 2.431 _handle_and_close_when_done  gevent/baseserver.py:24
   └─ 2.431 handle  gevent/pywsgi.py:1498
      └─ 2.431 handle  gevent/pywsgi.py:442
         └─ 2.431 handle_one_request  gevent/pywsgi.py:592
            └─ 2.431 handle_one_response  gevent/pywsgi.py:945
               └─ 2.431 run_application  gevent/pywsgi.py:907
                  └─ 2.431 __call__  flask/app.py:1995
                     └─ 2.431 wsgi_app  flask/app.py:1952
                        └─ 2.431 full_dispatch_request  flask/app.py:1600
                           └─ 2.431 dispatch_request  flask/app.py:1578
                              └─ 2.431 wrapfn  wwwd.py:214
                                 └─ 2.431 view_home_thread  wwwd.py:952
                                    └─ 1.452 <listcomp>  wwwd.py:995
                                       └─ 1.426 __next__  openarc/graph.py:404
                                          └─ 1.397 next  openarc/graph.py:169
                                             └─ 1.209 _set_attrs_from_cframe  openarc/_rdf.py:344
                                                └─ 0.818 add  openarc/_rdf.py:144

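Each number is the total time spent in that call and everything beneath it, so the frames at the top are big only because they contain everything else. Walk down the tree until the time drops sharply or splits between branches -- that is where the expensive code lives, so optimize there first if you can.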
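One quick way to see where the time goes is to subtract each frame's time from its parent's. Here is a small sketch using the last few frames of the output above (the numbers are copied straight from it; the helper name `time_not_in_child` is made up). Bear in mind that the excerpt follows a single branch, so a gap could be time in the function's own body or in sibling calls that are not shown.

```python
# (cumulative seconds, function) for the last frames of the tree above,
# outermost first. Values are copied from the sample output.
frames = [
    (2.431, "view_home_thread"),
    (1.452, "<listcomp>"),
    (1.426, "__next__"),
    (1.397, "next"),
    (1.209, "_set_attrs_from_cframe"),
    (0.818, "add"),
]


def time_not_in_child(frames):
    """Return (function, seconds) pairs: each frame's time minus its shown child's."""
    gaps = []
    for (parent_time, name), (child_time, _child) in zip(frames, frames[1:]):
        gaps.append((name, round(parent_time - child_time, 3)))
    return gaps


for name, seconds in time_not_in_child(frames):
    print(f"{seconds:6.3f}  {name}")
```

The biggest gap is at view_home_thread: almost a second of its 2.431 is spent somewhere other than the list comprehension. Further down, add accounts for 0.818 of the 1.209 spent in _set_attrs_from_cframe. Those are the two places I would look first.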

There are other tools in my optimization toolkit, but this is the one I find myself turning to most often. It is insanely easy to use, and in my book that trumps sophisticated tools any day of the week: sometimes a general, bird's-eye view is all you need to notice that your program is doing something stupid.
