Tuesday, February 24, 2015

Writing Effective Examples in Elixir docstrings.

I had an email exchange on the elixir-core mailing list recently that I think is worth preserving as a blog post. It shows a good way to write examples for functions with complex output.

We had a meetup this weekend for hacking the docs. The intent was to add examples to the standard library. 

I started looking around in code.ex and the question I ran into is the following:

Should examples be functionally accurate or readable? 

Here's what I mean. I was attempting to add an example for 


This is what I came up with. 

## Examples

      iex> Code.get_docs(Atom, :docs)
      [{{:to_char_list, 1}, 36, :def, [{:atom, [], nil}],
      "Converts an atom to a char list.\\n\\nInlined by the compiler.\\n\\n## Examples\\n\\n    iex> Atom.to_char_list(:\"An atom\")\\n    'An atom'\\n\\n"},
      {{:to_string, 1}, 20, :def, [{:atom, [], nil}],
      "Converts an atom to a string.\\n\\nInlined by the compiler.\\n\\n## Examples\\n\\n    iex> Atom.to_string(:foo)\\n    \"foo\"\\n\\n"}]
      iex(1)> Code.get_docs(Atom, :all )
      [docs: [{{:to_char_list, 1}, 36, :def, [{:atom, [], nil}],
      "Converts an atom to a char list.\\n\\nInlined by the compiler.\\n\\n## Examples\\n\\n    iex> Atom.to_char_list(:\"An atom\")\\n    'An atom'\\n\\n"},
      {{:to_string, 1}, 20, :def, [{:atom, [], nil}],
      "Converts an atom to a string.\\n\\nInlined by the compiler.\\n\\n## Examples\\n\\n    iex> Atom.to_string(:foo)\\n    \"foo\"\\n\\n"}],
      moduledoc: {1,"Convenience functions for working with atoms.\\n\\nSee also `Kernel.is_atom/1`.\\n"}]

It's really messy to read, and I'm still working on getting the quoting correct for ex_doc. 
Atom is about the simplest standard module available, so picking anything else just 
makes it worse. It also suffers from the problem that it doesn't automatically update when
the docs for Atom update. 

My initial thought was to dynamically create a Sample module and then use Code.get_docs
on that example ( i.e. similar to the test for Code.get_docs ).  

So I guess the question is: 

Should any examples in the documentation exactly match the results from running the code in iex? 

i.e. while the standard lib does not use doctest as far as I know, any examples included should be
doctest aware. Or is it okay to simplify the results for clarity? 

- Booker C. Bense

José Valim 
Feb 23
Should any examples in the documentation exactly match the results from running the code in iex?

Yes. But you can always "cheat"!

# Get the doc for the first function
iex> [fun|_] = Code.get_docs(Atom, :docs) |> Enum.sort()
# Each clause is in the format
iex> {{_function, _arity}, _line, _kind, _signature, text} = fun
# Let's get the first line of the text
iex> String.split(text, "\n") |> Enum.at(0)

So you are showing the whole process without printing it all the way.

This is a really nice way to take complex, dense output and recast it as a simple example that makes the return value of the function much clearer. 
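The same idea can be tried without depending on any particular Elixir version. The sketch below is only an illustration: the `docs` list is a hard-coded, abbreviated copy of the output quoted earlier (an assumption, not live output), and the variable names are mine. Note that in Elixir 1.7 and later `Code.get_docs/2` was replaced by `Code.fetch_docs/1`, which returns a differently shaped result, so the destructuring pattern would need to change there.

```elixir
# Hard-coded sample in the shape Code.get_docs(Atom, :docs) returned in the
# post above (doc text abbreviated). This is an assumption, not live output.
docs = [
  {{:to_string, 1}, 20, :def, [{:atom, [], nil}],
   "Converts an atom to a string.\n\nInlined by the compiler.\n"},
  {{:to_char_list, 1}, 36, :def, [{:atom, [], nil}],
   "Converts an atom to a char list.\n\nInlined by the compiler.\n"}
]

# Sort so that the "first" entry is deterministic, then take it.
[fun | _] = Enum.sort(docs)

# Each entry has the format {{function, arity}, line, kind, signature, text}.
{{function, arity}, _line, _kind, _signature, text} = fun

# Show only the first line of the doc text instead of the whole structure.
summary = text |> String.split("\n") |> Enum.at(0)

IO.puts("#{function}/#{arity}: #{summary}")
```

Each step prints something small enough to read, while the pattern match documents the shape of the full return value.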

Thursday, January 8, 2015

Using ex_doc to create a github web site for your elixir project.

1. Use mix docs to create the doc dir.

You'll need to add these to the deps section for your project in mix.exs:

[{:earmark, "~> 0.1", only: :dev},
 {:ex_doc, "~> 0.5", only: :dev}]
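For context, a whole mix.exs with those entries in place might look roughly like the sketch below. The module name, app name and version are hypothetical placeholders; only the two deps entries come from this post.

```elixir
# mix.exs for a hypothetical project. Only the two deps entries are from the
# post; MyProject, :my_project and the version number are placeholders.
defmodule MyProject.Mixfile do
  use Mix.Project

  def project do
    [app: :my_project,
     version: "0.0.1",
     deps: deps()]
  end

  # The documentation tooling is only needed in the :dev environment.
  defp deps do
    [{:earmark, "~> 0.1", only: :dev},
     {:ex_doc, "~> 0.5", only: :dev}]
  end
end
```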

Then run

mix deps.get
mix deps.compile
mix docs

This will create a doc subdir with HTML documentation for your project, very similar to
that used by the standard Elixir docs.

What you need to do now is install those HTML pages into the special git branch that
allows you to create a user.github.io/project website.

2. Follow the manual gh-pages instructions on github.


git clone git@github.com:user/project.git project-gh-pages

cd project-gh-pages

git checkout --orphan  gh-pages

cp -r ../project/doc/* .
git add --all

( Note this is about the only time I would ever use git add --all )

git commit -a -m "Ex doc output"
git push origin gh-pages

After a few minutes the web pages should show at the url


Friday, November 14, 2014

Direct link to host in Nagios via Apache Rewrite

One of the people using our Nagios server asked the other day:

"I know this doesn't work, but can https://nagios.our.site/ourhost go directly to the page for host?"

I explained that the frames inside the current version of Nagios that we use were simply
links to cgi urls like


and that you could type that url into the browser and get a direct link. As I was explaining this I realized that it would not be too hard to use Apache Rewrite to create a url like this


This is what I added to the nagios.conf file in /etc/httpd/conf.d to accomplish this.

 RewriteEngine On
 RewriteLog /etc/httpd/logs/ssl_rewrite_log

 # Rewrite host/foo to basic cgi view url
 RewriteRule   ^/host/(.*\.cgi)$               /cgi-bin/$1                [L,PT]
 RewriteRule   ^/host/([a-z0-9\-]+)$           /cgi-bin/status.cgi?host=$1 [L,PT]

The first rule rewrites any links clicked from within the initial page so that they resolve to the appropriate CGI; the second rewrites /host/ourhost to a CGI URL pointing to the basic status page. The flags are important: L means "last rule, stop attempting to rewrite the URL", and PT ("pass through") means that the result is not a plain file but needs to go back through the URI engine to be resolved to a CGI.
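To check what the two patterns do to a given path, they can be mimicked outside Apache. The sketch below is an approximation in Elixir, not how mod_rewrite is implemented: the module and function names are made up, the host pattern is written with a plain `+` quantifier, and the "first match wins" ordering stands in for the L flag.

```elixir
defmodule RewriteSketch do
  # Mirrors the first RewriteRule: /host/<anything>.cgi -> /cgi-bin/<anything>.cgi
  def rewrite(path) do
    case Regex.run(~r{^/host/(.*\.cgi)$}, path) do
      [_, cgi] -> "/cgi-bin/" <> cgi
      nil -> rewrite_host(path)
    end
  end

  # Mirrors the second RewriteRule: /host/<name> -> the status page for <name>.
  # Paths that match neither pattern are returned unchanged.
  defp rewrite_host(path) do
    case Regex.run(~r{^/host/([a-z0-9\-]+)$}, path) do
      [_, host] -> "/cgi-bin/status.cgi?host=" <> host
      nil -> path
    end
  end
end

IO.puts(RewriteSketch.rewrite("/host/ourhost"))
IO.puts(RewriteSketch.rewrite("/host/extinfo.cgi"))
```

A name containing a dot, such as extinfo.cgi, cannot match the host pattern, which is why the two rules do not interfere with each other.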

It's a simple hack, but people seem to like it. Which is probably more a statement about the Nagios dashboard than the utility of this tweak.

Monday, September 15, 2014

Elixir command line parsing

I haven't had much luck finding a complete example of command line parsing in Elixir, so I thought I'd share what I've come up with so far.

This example uses a Map to store the options as key/value pairs. This implies that there can be only one value for any given option.

  def main(args) do
    args |> parse_args |> process
  end

This function defines the main flow of control in the program.

  def parse_args(args) do
    options = %{ :count => @max_ofiles ,
                 :type  => @default_type }
    cmd_opts = OptionParser.parse(args, 
          switches: [help: :boolean , count: :integer],
          aliases: [h: :help, c: :count])

    case cmd_opts do
      { [ help: true], _, _}   -> :help
      { [], args, [] }         -> { options, args }
      { opts, args, [] }       -> { Enum.into(opts,options), args }
      { opts, args, bad_opts}  -> { Enum.into(merge_opts(opts,bad_opts),options), args }
      _                        -> :help
    end
  end

The main command line argument parsing function. It sets up an option map with default values and then merges in any values from the command line. It also allows undefined options, see rehabilitate_args below.

  def merge_opts(opts,bad_opts) do
    bad_opts |>  rehabilitate_args |> Keyword.merge(opts)
  end

A simple helper function to make the main parse routine less complicated.

  def rehabilitate_args(bad_args) do
    bad_args
    |> Enum.flat_map(fn(x) -> Tuple.to_list(x) end)
    |> Enum.filter_map(fn(str) -> str end, 
                       fn(str) -> 
                         String.replace(str, ~r/^\-([^-]+)/, "--\\1") 
                       end )
  end

The function rehabilitate_args is something I've included since my application will use plugins and I would like plugin authors to be able to simply use command line args w/o a complicated interface. This might or might not be a good idea and is mostly included as an example of how to handle bad_args if you want to. If you use :strict rather than :switches in the OptionParser.parse call, undefined options are automatically flagged as invalid instead of being parsed.
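For comparison, here is a minimal sketch of the :strict variant. It is not the code from my application: the module name, the default values (standing in for @max_ofiles and @default_type) and the {:error, invalid} return shape are all assumptions made for the example.

```elixir
defmodule CliSketch do
  # Hypothetical defaults standing in for @max_ofiles and @default_type.
  @defaults %{count: 512, type: "basic"}

  def parse_args(argv) do
    parsed =
      OptionParser.parse(argv,
        strict: [help: :boolean, count: :integer],
        aliases: [h: :help, c: :count]
      )

    case parsed do
      # Help was the only option given.
      {[help: true], _args, _invalid} -> :help
      # Nothing invalid: merge the parsed options over the defaults.
      {opts, args, []} -> {Enum.into(opts, @defaults), args}
      # With :strict, unknown or badly typed options end up in the third element.
      {_opts, _args, invalid} -> {:error, invalid}
    end
  end
end

IO.inspect(CliSketch.parse_args(["-c", "3", "file.txt"]))
IO.inspect(CliSketch.parse_args(["--bogus", "file.txt"]))
```

The trade-off is the one described above: :strict gives you error reporting for free, while :switches leaves room for options the main program does not know about.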

  def process(:help) do 
    IO.puts @module_doc
  end

  def process({options,args}) do
    # ... the real work on options and args goes here
  end

This is an example of using Elixir's pattern matching in function definitions to route the flow of control in the program. The value returned from the parse_args call determines which version of the process function gets run. This code in actual use can be found in my github repo https://github.com/bbense/elixgrep
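A stripped-down sketch of that dispatch, with made-up return strings in place of real work (the module name and messages are hypothetical), shows how the shape of the argument picks the clause:

```elixir
defmodule DispatchSketch do
  # Chosen when the parser returned the bare atom :help.
  def process(:help) do
    "usage: prog [--count N] FILE..."
  end

  # Chosen when the parser returned an {options, args} tuple.
  def process({options, args}) do
    "processing #{length(args)} file(s) with count=#{options[:count]}"
  end
end

IO.puts(DispatchSketch.process(:help))
IO.puts(DispatchSketch.process({%{count: 3}, ["a.txt", "b.txt"]}))
```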

Friday, September 12, 2014

Elixir spawn/1 and spawn/3

Since I spent an afternoon banging my head against this, I thought I'd blog it.

If you have elixir code that dies with this message

17:39:25.661 [error] Error in process <0.60.0> with exit value: {undef,[{'Elixir.MyFunModule',fun_stuff,[],[]}]}

when you do this

  pid = spawn( MyFunModule, :fun_stuff, [] )

But runs just fine when you do this

 pid = spawn( fn -> MyFunModule.fun_stuff([]) end )

The problem is the arity of the function you are using in the spawn/3 version.
Elixir uses the count of the elements in the third argument to spawn/3 to
find the function to spawn in the subprocess. In the first version it is looking
for a MyFunModule.fun_stuff/0 which does not exist. If you change

  pid = spawn( MyFunModule, :fun_stuff, [] )

to

  pid = spawn( MyFunModule, :fun_stuff, [[]] )

it will then run without the process error above.
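Here is a small script that reproduces both cases. It is a sketch with a stand-in MyFunModule (the body of fun_stuff/1 is invented), and it uses spawn_monitor/3 instead of spawn/3 purely so the script can observe how each process exits; the arity lookup works the same way.

```elixir
defmodule MyFunModule do
  # Only fun_stuff/1 exists; there is no fun_stuff/0.
  def fun_stuff(list) when is_list(list) do
    length(list)
  end
end

# Waits for the monitored process to finish and returns its exit reason.
await = fn {pid, ref} ->
  receive do
    {:DOWN, ^ref, :process, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end
end

# One element in the argument list, so fun_stuff/1 is called with [].
good_reason = await.(spawn_monitor(MyFunModule, :fun_stuff, [[]]))

# Zero elements in the argument list, so fun_stuff/0 is looked up and not found.
bad_reason = await.(spawn_monitor(MyFunModule, :fun_stuff, []))

IO.inspect(good_reason, label: "with [[]]")
IO.inspect(bad_reason, label: "with []")
```

The second process also triggers an error report like the one at the top of this post.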

Thursday, August 21, 2014

Multi-Core, Languages and the Dreaded GIL

For a very long time[1], I have thought that the only way to deal with the coming multi-core explosion is to move to a language that treats concurrency as a core feature. This is hardly a novel idea. It has always been obvious that threaded code in standard languages is next to impossible to write correctly. This implies that the only way to deal with threads is to hide them in an abstraction to which you can devote enough resources to ensure a reasonably robust implementation. Writing thread-safe code is extremely difficult and requires a huge payback to justify the effort. It's exactly the same reasoning behind allowing the kernel to do memory and process management, rather than having every application write its own OS.

In the face of this I have made a serious effort to learn Erlang and Go and to dip my toe in functional languages, in an attempt to be ready for this multi-core world. As much as I love Ruby, it was obvious that it simply would not be a reasonable solution for a machine with 32 cores and above. It and many of the other "scripting" languages depend on holding a lock, the dreaded GIL[2]. This is generally not a problem for code that runs websites, because those workloads are largely I/O bound[3]. However, for any other workload that requires CPU, these languages would be relegated to the "fancy shell script" problem domain. You simply wouldn't write anything serious in these languages, since they can't effectively use 80-90% of the available resources on a high-end server.

Recently, I have begun to rethink this assumption. The reality is that most 32 core machines are running VMs in one form or another. Docker[4], in particular, provides a highly efficient means of simulating a single core OS for a single core process. A common theme of the "multi-core" ready languages is to avoid the problems of threading by switching the model from shared memory to very fast messaging. If it is possible to create an interprocess message system between Docker containers that can compete with the speed of Go channels or Erlang messaging, then it should be possible for languages like Ruby to be a reasonable alternative in this brave new world.

There was an attempt to allow this kind of distributed processing in Ruby: DRb. It never really gained wide usage, primarily due to latency and AAA[5] issues. However, if there were an API that allowed communication between Docker instances that could be both fast and trusted, it would be possible to revive DRb and similar "remote object" implementations. I think this is the only workable way forward for the "interpreted" languages. Essentially you have to convert the message sending primitive to sending a message to remote[6] objects. This isn't as hard as it looks at first blush. Docker containers are simply processes running in the same kernel on the same machine. They just have different ideas about their various interfaces to the rest of the world. It should be possible to "short-circuit" these interfaces to allow them to communicate at something close to the speeds available to more concurrency-aware languages.

[ After a couple days of thinking about what this really means.]

This idea looks more like "sprinkle current fairy dust" on a hard problem than an actual solution. I think what I was trying to wrap my brain around was the idea of implementing something like the old Sun RPC interface via standard ports, as a way to use Docker to avoid all the overhead of doing message passing between Ruby processes via pipes, etc.

I think there is something there, but it's far from fleshed out enough. Maybe just creating a lightweight microservices API and using Docker to allow those APIs to talk on the same machine might be enough. I am well aware of the first rule of Distributed Objects.


[1]- Since about 1997 or so (i.e. when I first attempted to write pthreaded code ).

[2]- Global Interpreter Lock https://en.wikipedia.org/wiki/Global_Interpreter_Lock

[3]- i.e. Turning database entries into web pages and web forms into database entries.

[4]- https://www.docker.com/

[5]- Authentication, Authorization and Accounting

[6]- If they are in the same memory accessed by the same cpu, they really aren't "remote"

Friday, November 8, 2013

If you really think your employees' performance is on a bell curve, you should fire your HR department.

A recent kerfuffle over in Yahoo Land has got me thinking about the misapplication of statistics.


The assumption is that employee performance follows a bell curve. But you only get a bell curve in your sample of a larger population if your sample is random and the larger population is also a bell curve.

What this means is that if your employees' performance truly follows a bell curve, then you should fire your HR department and replace all those expensive people with a random process that just picks a resume out of a hat. Frankly, I suspect you'd be hard pressed to tell the difference in most organizations.