Friday, November 14, 2014

Direct link to host in Nagios via Apache Rewrite

One of the people using our nagios server asked the other day:

"I know this doesn't work, but can  https://nagios.our.site/ourhost go directly to the page for host?"

I explained that the frames inside the current version of Nagios that we use were simply
links to cgi urls like

https://nagios.our.site/cgi-bin/status.cgi?type=1&host=ourhost

and that you could type that url into the browser and get a direct link. As I was explaining this I realized that it would not be too hard to use Apache Rewrite to create a url like this

https://nagios.our.site/host/ourhost

This is what I added to the nagios.conf file in /etc/httpd/conf.d to accomplish this.
 
 RewriteEngine On
 RewriteLog /etc/httpd/logs/ssl_rewrite_log

 # Rewrite host/foo to basic cgi view url
 RewriteRule   ^/host/(.*\.cgi)$               /cgi-bin/$1                [L,PT]
 RewriteRule   ^/host/([a-z0-9\-]+*)$          /cgi-bin/status.cgi?host=$1 [L,PT]
    

The first rule rewrites any links clicked from with the initial page to correctly redirect to the appropriate CGI, the second rewrites the /host/ourhost to be a CGI url pointing to the basic status page.  The options are important, the first L means "last rule, stop attempting to rewrite the url", PT means that it is not a plain file, but needs to go back through the URI engine to be resolved to a cgi.

It's a simple hack, but people seem to like it. Which is probably more a statement about the Nagios dashboard than the utility of this tweak.

Monday, September 15, 2014

Elixir command line parsing

I haven't had much luck finding a complete example of command line parsing in Elixir, so I thought I'd share what I've come up with so far.

This example shows uses a Map to ultimately store the options as key, value pairs. This implies that there can be only one value for any  given option.


  def main(args) do
      args |> parse_args |> process
  end

This function defines the main flow of control in the program.

  def parse_args(args) do
    options = %{ :count => @max_ofiles ,
                 :type  => @default_type
                 }
    cmd_opts = OptionParser.parse(args, 
          switches: [help: :boolean , count: :integer],
          aliases: [h: :help, c: :count])

    case cmd_opts do
      { [ help: true], _, _}   -> :help
      { [], args, [] }         -> { options, args }
      { opts, args, [] }       -> { Enum.into(opts,options), args }
      { opts, args, bad_opts}  -> { 
                                   Enum.into(merge_opts(opts,bad_opts),
                                      options), 
                                      args
                                     }
      _                        -> :help
    end
  end

The main command line argument parsing function. It sets up an option map with default values and then merges in any values from the command line. It also allows undefined options, see rehabilitate_args below.


  def merge_opts(opts,bad_opts) do
    bad_opts |>  rehabilitate_args |> Keyword.merge(opts)
  end 


A simple helper function to make the main parse routine less complicated.


  def rehabilitate_args(bad_args) do
      bad_args 
     |> 
      Enum.flat_map(fn(x) -> Tuple.to_list(x) end)
     |> 
      Enum.filter_map(fn(str) -> str end, 
                      fn(str) -> 
                        String.replace(str, ~r/^\-([^-]+)/, "--\\1") 
                      end )
     |>
      OptionParser.parse
     |> 
      Tuple.to_list
     |>
      List.first
  end

The function rehabilitate_args is something I've included since my application will use plugins and I would like plugin authors to be able to simply use command line args w/o a complicated interface. This might or might not be a good idea and is mostly included as an example of how to handle bad_args if you want to. If you use :strict rather than :switches in the OptionParser.parse call undefined options will automatically generate an error.


  def process(:help) do 
    IO.puts @module_doc
    System.halt(0)
  end
  
  def process({options,args}) do
     AllTheRealWork.start
  end
 

This is an example of using elixir's pattern matching in function definitions to allow it to route flow of control in the program. The value returned from the pargs_args call will determine which version of the process function gets run. This code in actual use can be found in my github repo https://github.com/bbense/elixgrep

Friday, September 12, 2014

Elixir spawn/1 and spawn/3

Since I spent an afternoon banging my head against this, I thought I'd blog it.

If you have elixir code that dies with this message


17:39:25.661 [error] Error in process <0.60.0> with exit value: {undef,[{'Elixir.MyFunModule',fun_stuff,[],[]}]}


when you do this

  pid = spawn( MyFunModule, :fun_stuff, [] )

But runs just fine when you do this

 pid = spawn( fn -> MyFunModule.funstuff([]) end )

The problem is the arity of the function you are using in the spawn/3 version.
Elixir uses the count of the elements in the third argument to spawn/3 to
find the function to spawn in the subprocess. In the first version it is looking
for a MyFunModule.fun_stuff/0 which does not exist. If you change

  pid = spawn( MyFunModule, :fun_stuff, [] )
to

  pid = spawn( MyFunModule, :fun_stuff, [[]] )
it will then run without the process error above.

Thursday, August 21, 2014

Multi-Core, Languages and the Dreaded GIL

For a very long time[1], I have always thought that the only way to deal with the coming multi-core explosion is to move to a language that deals with concurrency as a core feature in the language. This is hardly a novel idea. It has always been obvious that threaded code in standard languages is next to impossible to write correctly. This implies that the only way to deal with threads is to hide them in an abstraction in which you can devote enough resources to ensure a reasonably robust implementation. Writing thread safe code is extremely difficult and requires a huge payback to justify the effort. It's exactly the same reasoning behind allowing the kernel to do memory and process management, rather than having every application write it's own OS.

In the face of this I have made a serious effort to learn Erlang, Go and dip my toe in functional languages in an attempt to be ready for adapting to this multi-core world. As much as I love Ruby,
it was obvious that it simply would not be a reasonable solution for a machine with 32 cores and above. It and many of the other "scripting" languages, depend on having a lock on the dreaded GIL[2]. This is generally not a problem for code that runs websites due to the fact that they are largely I/O bound[3]. However, for any other workload that requires CPU these languages would be relegated to the "fancy shell script" problem domain. You simply wouldn't write anything serious in these languages, since they can't effectively use 80-90% of the available resources on a high end server.

Recently, I have begun to rethink this assumption.  The reality is that most 32 core machines are running VM's in one form or another. Docker[4], in particular, provides a highly efficient means of simulating a single core OS for a single core process. A common theme of the "multi-core" ready languages is to avoid the problems of threading by switching the model from shared memory to very fast messaging. If it is possible to create an interprocess message system between Docker images that can compete with speed of Go channels or Erlang messaging, then it should be possible for languages like Ruby to be a reasonable alternative in this brave new world.

There was an attempt to allow this kind of distributed processing in Ruby, DRb. It never really gained wide usage primarily due to the latency and AAA[5] issues. However, if there was an API that allowed communication between Docker instances that could be both fast and trusted, it would be possible to revive Drb and similar "remote object" implementations. I think this is the only workable way forward for the "interpreted" languages. Essentially you have to convert the message sending primitive to sending a message to remote[6] objects. This isn't as hard as it looks at first blush, Docker images are simply processes running in the same kernel on the same machine. They just have different ideas about their various interfaces to the rest of the world. It should be possible to "short-circuit" these interfaces to allow them to communicate at something close to the speeds available to more concurrency aware languages.

[ After a couple days of thinking about what this really means.]

This idea looks more like "sprinkle current fairy dust" on a hard problem, than an actual solution. I think what I was trying to wrap my brain around was the idea of implementing something like the old sun RPC interface via standard ports as a way to allow you to use Docker to avoid all the overhead of
doing message passing between ruby processes via pipes, etc...

I think there is something there, but it's far from fleshed out enough. Maybe just creating a lightweight micro services API and using Docker to allow those API's to talk on the same machine
might be enough. I am well aware of the first rule of Distributed Objects.

http://martinfowler.com/articles/distributed-objects-microservices.html






[1]- Since about 1997 or so (i.e. when I first attempted to write pthreaded code ).

[2]- Global Interpreter Lock https://en.wikipedia.org/wiki/Global_Interpreter_Lock

[3]- i.e. Turning database entries into web pages and web forms into database entries.

[4] - https://www.docker.com/

[5]- Authentication, Authorization  and Accounting

[6]- If they are in the same memory accessed by the same cpu, they really aren't "remote"

Friday, November 8, 2013

If you really think your employee's performance is on a bell curve, you should fire your HR department.

A recent kerfuffle over in Yahoo Land has got me thinking about the mis-application of statistics.

http://allthingsd.com/20131108/because-marissa-said-so-yahoos-bristle-at-mayers-new-qpr-ranking-system-and-silent-layoffs/

The assumption is that employee performance follows a bell curve. But you only get a bell curve in your sample of a larger population if your sample is random and the larger population is also a bell curve.

What this means is that if your employee's performance truly follows a bell curve, then you should fire your HR department and completely replace all those expensive people with a random process that just picks a resume out of the hat. Frankly, I suspect you'd be hard pressed to tell the difference in most organizations.

Thursday, October 10, 2013

Person Capital and the Government Shutdown


One of the mantras of the information age is

"People are Capital"

In other words, the experience and knowledge of your workforce is a significant capital investment. It is the only capital investment that can walk out the door whenever it get's pissed off enough and right now all the best people employed by either directly or indirectly by the federal government are polishing their resumes and checking just exactly who they know on LinkedIn. I know because that's exactly what I've been doing.

The great lie that has been put out there is that government jobs are cushy and easy and only incompetent people would take them. Well, one way or another I've been employed by the Federal government indirectly for the every job I've ever had since I stopped waiting tables, cooking or washing dishes in 1987. The reality is that most people in the government work long hours for relatively low wages given their expertise and education. Despite every attempt to belittle it on Fox News, Public service is alive and well. Many of the people not getting paychecks are hard working competent people that
could make more money in the private sector, but choose to work for the federal paycheck because they believe in public service. For me, it matters that what I do advances the knowledge of mankind in some way. 

I remember a conversion in high school (1978) about the future and I said my "dream job" would be to work at a "national lab doing physics research". One of my friends that had more exposure to the larger world said "Those are terrible jobs". 

It's really odd that we were both right. My job is my dream job, but it is also terrible in that I never get the resources to do it right.  The current drama in Washington is the last straw for many of us; the best people currently employed by the government are all looking around for stable paychecks.

Monday, September 16, 2013

Chef Encrypted Data Bags Are A Code Smell.


Chef databags are a centralized key/value store for json based data in the Chef environment. Encrypted databags are exactly what is advertised on the box, the data is encrypted with an external key before uploading to the data store.

Encrypted databags are meant to help solve the problem of dealing with identifier information that you want automatically installed, but that you need to keep private. Examples are database login passwords, the private key of a public key pair and other tokens that allow a server access to additional services.

Encrypted databags provide protection against two kinds of access:

Implicit Access:

This is defined to mean access outside the chef protocol to the underlying data store on your chef server. If you're using a service like Hosted Chef or just practicing general good data hygiene, DB access to the server should only expose "public" data if at all possible.

Explicit Access:

Explicit access is via the standard chef protocol were using the ssl keypair created as part of the chef client bootstrap to access data in the chef datastore or via a knife command using an admin key pair.

The problem with encrypted databags in both use cases is that they only appear to solve the underlying problem by moving it out of the chef workflow. In order to use either protection effectively, you need to create an "out of band" system to manage the shared secret required to access the databag.

In the implicit access protection case, this is likely a worthwhile and manageable cost since the key only needs to be secret from the chef server. However, this only protects against read-only implicit access. If the bad guy has write implicit access to your chef datastore, the game is over on your next client chef run.

The explicit access case is where the real problems arise. In this case, the intention is to prevent some admins/hosts from access to private data. This however creates an unfortunate side effect in that access to the shared secret becomes an "invisible access control list". Using an encryption key as an authorization object creates problems since you destroy the chain of identity. All you ever know is that "somebody with the key" accessed the data. You need to create a separate tracking and access control system to provide an audit-able trail. Encrypted databags don't solve a problem in this case, they just create a whole new set of problems.

Chef Vault is an attempt to get around this problem by creating access control lists using the available private key on each chef client. Without the strong ACL system that either Private or Hosted Chef provide, this is probably the simplest workable solution. Any other solution should implement all of the features of Chef Vault. The most general solution in the explicit access case is ACL's based on the ssl identity of the client. There are many other objects in Chef on which the ACL system of the Opscode Chef would provide useful security boundaries.

It's important to remember that more keys is not better security. Encryption keys should be used to provide data integrity, privacy and authentication. (i.e. they answer the who questions, who are you? who am I? who sent that message? ). They should never be used to answer the what questions. ( what can I read? what can I write?, etc ).

Providing secrets in a scalable and secure fashion is still a largely unsolved problem. But any solution that attempts to use only shared secret encryption will not scale. Public key encryption and rings of shared access seem to be the only workable way forward and every chef client and admin has key pair already available. Any scalable solution should be using this existing identity. If you are using encrypted databags to control explicit access, then you are building in scaling, access and audit problems for the future.