My ultimate goal is to publish something almost daily. Unfortunately, until I develop all the surrounding workflows, publication will be a little slower. One of the key components that has cropped up early is a good way to run analysis code and insert matplotlib figures directly into the buffer. This can be accomplished relatively easily with some extra code and Org-babel.
Down the rabbit hole
In my blog.org file I can include the following Python source in an Org-babel source block:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
plt.savefig("img/analysis-of-org-mode-headings/org-level-hist.png")
return "img/analysis-of-org-mode-headings/org-level-hist.png"
I can execute the block by using the C-c C-c key binding on the NAME or the BEGIN_SRC lines. This produces the following figure:
There are three separate components here:
The first imports matplotlib, sets matplotlib's backend to 'Agg' (a non-interactive backend that renders to files without needing a display), and then imports the pyplot library as plt:
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
The second runs the actual bit of analysis:
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
Finally, the third saves the file out to disk using the savefig() function and returns the name of the file:
plt.savefig("img/analysis-of-org-mode-headings/org-level-hist.png")
return "img/analysis-of-org-mode-headings/org-level-hist.png"
This last statement may seem a little strange. It is an artifact of how ob-python evaluates blocks. Behind the scenes Org will wrap the whole code block in a function, and the function will return this value. If your org-mode block has :results file set in its header arguments, then the results block will contain a link, which can be displayed inline in your Emacs buffer and is correctly exported as a markdown image link.
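To make the point about the trailing return concrete, here is a minimal Python sketch. It is not ob-python's actual code (that lives in Emacs Lisp); the body string, the img/example.png path, and the main() name are stand-ins chosen for illustration. It only shows that a bare return is rejected at module level but becomes valid once the same lines are wrapped in a function:

```python
import textwrap

# Hypothetical block body: the plotting calls are left out, only the
# final "return the file name" step is kept.
body = 'path = "img/example.png"\nreturn path\n'

# At module level, Python refuses to compile a bare `return`.
try:
    compile(body, "<block>", "exec")
    bare_ok = True
except SyntaxError:
    bare_ok = False

# Wrapped in a function (roughly what Org does), the same lines are
# valid and the function call yields the file name.
wrapped = "def main():\n" + textwrap.indent(body, "    ")
namespace = {}
exec(compile(wrapped, "<wrapped>", "exec"), namespace)
result = namespace["main"]()

print(bare_ok, result)
```

This is also why the same block would fail if pasted unchanged into a plain .py file: the return only makes sense inside the wrapper.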
All together this extended example looks like this in my blog.org file:
#+NAME: org-level-hist-extended
#+BEGIN_SRC python :exports both :results file
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
plt.savefig("img/analysis-of-org-mode-headings/org-level-hist.png")
return "img/analysis-of-org-mode-headings/org-level-hist.png"
#+END_SRC
and produces the following results block:
#+RESULTS: org-level-hist-extended
[[file:img/analysis-of-org-mode-headings/org-level-hist.png]]
The problem is that the matplotlib preamble and the org-mode postamble are quite distracting. An ideal solution would involve an executable Python block that still produces the correct results block but exports only the relevant part of the code.
A good place to start is with org-mode's noweb syntax. Noweb lets you reuse code contained in other org-mode source code blocks through basic syntactic expansion. We can place our preamble code in a block like so:
#+NAME: plt-preamble
#+BEGIN_SRC python :results file :exports none
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
#+END_SRC
and our postamble code:
#+NAME: plt-postamble
#+BEGIN_SRC python :results file :exports none
plt.savefig("img/analysis-of-org-mode-headings/org-level-hist.png")
return "img/analysis-of-org-mode-headings/org-level-hist.png"
#+END_SRC
Then use the <<...>> syntax to reference these code blocks in a block that does the actual analysis:
#+NAME: org-level-hist
#+BEGIN_SRC python :exports both :noweb strip-export :results file
<<plt-preamble>>
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
<<plt-postamble>>
#+END_SRC
In order to get the desired effect we set the noweb header argument to :noweb strip-export. This expands the noweb references when the block is executed, but strips them out when the block is exported to markdown. Great start!
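The difference between expanding and stripping references can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it handles references that sit alone on a line, the expand() and strip_refs() helpers are hypothetical names, and the block contents are shortened stand-ins. Org's real noweb handling is implemented in Emacs Lisp and supports more cases.

```python
import re

# Named blocks, standing in for the #+NAME: blocks in the org file.
blocks = {
    "plt-preamble": (
        "import matplotlib\n"
        "matplotlib.use('Agg')\n"
        "import matplotlib.pyplot as plt"
    ),
    "plt-postamble": (
        'plt.savefig("img/example.png")\n'
        'return "img/example.png"'
    ),
}

# A line consisting solely of a <<name>> reference.
NOWEB_REF = re.compile(r"^<<(.+?)>>$")


def expand(body, blocks):
    """Replace each whole-line <<name>> reference with the named block's body."""
    out = []
    for line in body.splitlines():
        match = NOWEB_REF.match(line.strip())
        out.append(blocks[match.group(1)] if match else line)
    return "\n".join(out)


def strip_refs(body):
    """Drop whole-line <<name>> references, keeping everything else."""
    kept = [line for line in body.splitlines() if not NOWEB_REF.match(line.strip())]
    return "\n".join(kept)


body = "<<plt-preamble>>\ntotal = 1 + 1\n<<plt-postamble>>"
expanded = expand(body, blocks)   # what gets executed
exported = strip_refs(body)       # what a reader of the export sees
print(expanded)
print(exported)
```

With strip-export, evaluation sees something like expanded while the exported post shows something like exported.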
The only nagging issue is specifying the file name of the image. plt-postamble as-is will always save to img/analysis-of-org-mode-headings/org-level-hist.png. To solve this problem we can modify plt-postamble to take a variable:
#+NAME: plt-postamble
#+BEGIN_SRC python :results file :exports none
# "path" variable must be set by block that
# expands this org source code block
plt.savefig(path)
return path
#+END_SRC
The final block that produces the analysis then includes that variable (path) as a HEADER argument. It looks like this:
#+NAME: org-level-hist
#+HEADER: :var path="img/analysis-of-org-mode-headings/org-level-hist.png"
#+BEGIN_SRC python :exports both :noweb strip-export :results file
<<plt-preamble>>
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
<<plt-postamble>>
#+END_SRC
Into the guts a little
This works because behind the scenes Org-babel is writing all of these blocks out to a temporary file and then having Python execute the file. The :var path="img/analysis-of-org-mode-headings/org-level-hist.png" header argument means org-mode will create a Python variable named path at the top of that file. The plt-postamble block picks up that variable and uses it to save the image and return the correct file name for Org's inline image display and markdown export.
The above code block is written out to a file with a name like /tmp/babel-29898Xn/ob-input-2989q6W:
def main():
    path="img/analysis-of-org-mode-headings/org-level-hist.png"
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    from orgstats import get_dataframe
    df, ti = get_dataframe("/home/kotfic/org2")
    df.groupby("Level").size().plot(kind='bar')
    plt.savefig(path)
    return path

open('/tmp/babel-29898Xn/python-2989dwQ', 'w').write( str(main()) )
Now this can get a little confusing… There is an input file whose name starts with ob-input-[...] and an output file whose name starts with python-[...]. Emacs generates the input file by creating a function main(), adding an assignment for each :var foo=bar header argument, expanding any noweb references, and inserting the Python code into the body of main(). It then appends the open(...).write( str(main()) ) line. Once the file is saved to the tmp folder, Emacs executes the Python script in a separate process. The script produces the output file (e.g., python-2989dwQ), and Emacs reads the contents of that file to get the results of the execution and inserts them into the org-mode buffer.
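The round trip described above can be imitated in plain Python. The sketch below is an approximation for illustration, not Org's implementation (which is Emacs Lisp): build_script() and run_block() are hypothetical helper names, the temp file names are made up, and the body is reduced to a single return so it runs without matplotlib.

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def build_script(body, variables, output_path):
    """Assemble a wrapper script shaped like the ob-input file shown above."""
    # One assignment per :var header argument, placed at the top of main().
    assignments = "".join(f"{name}={value!r}\n" for name, value in variables.items())
    wrapped = "def main():\n" + textwrap.indent(assignments + body, "    ")
    # The trailing line writes main()'s return value to the output file.
    return wrapped + f"\n\nopen({output_path!r}, 'w').write( str(main()) )\n"


def run_block(body, variables):
    """Write the script to a temp dir, run it in a separate process, read the result."""
    with tempfile.TemporaryDirectory() as tmp:
        input_path = os.path.join(tmp, "ob-input-example")
        output_path = os.path.join(tmp, "python-example")
        with open(input_path, "w") as handle:
            handle.write(build_script(body, variables, output_path))
        subprocess.run([sys.executable, input_path], check=True)
        with open(output_path) as handle:
            return handle.read()


# Stand-in for the expanded block: only the postamble's return is kept.
result = run_block("return path", {"path": "img/example.png"})
print(result)
```

The string that comes back is what ends up in the results block; with :results file it is treated as a file name and turned into a link.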
Normally this ob-input-2989q6W file is deleted. You can bind org-babel--debug-input to true (e.g., (setq org-babel--debug-input t)) to keep it around if you want to directly debug the complete Python script. I have found that to be a life saver.
Whew
So with that little trip into the guts of ob-eval.el, this block:
#+NAME: org-level-hist-example
#+HEADER: :var path="img/analysis-of-org-mode-headings/org-level-hist.png"
#+BEGIN_SRC python :exports both :noweb strip-export :results file
<<plt-preamble>>
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
<<plt-postamble>>
#+END_SRC
exports as:
from orgstats import get_dataframe
df, ti = get_dataframe("/home/kotfic/org2")
df.groupby("Level").size().plot(kind='bar')
Which is exactly what we were looking for.
Wrap up
An obvious question is "Why not IPython Notebook?" With mixed markdown and Python code and export to HTML/markdown, IPython has excellent support for exactly this kind of workflow, plus I already use it almost every day. So why not write this blog in IPython like a real champ?
Ultimately (for me) org-mode provides more functionality and a better text editing experience, though clearly at the cost of increased complexity. It's pretty astonishing that these kinds of Literate Programming capabilities are available at all. With Emacs I can write in text but also in Python, R, and SQL; I can make charts and graphs with ditaa and graphviz; I have built-in table support with Excel-like features; and it all exports to HTML, ODT, PDF, LaTeX, Markdown, and a half-dozen others.
Really… not too shabby for a text editor that was created in 1976.