What Does Fibonacci Look Like in Elixir?

I recently had a conversation about Elixir, since I have been using it more and it was not a language the other person was familiar with. Being curious engineers, one basic question we landed on was, "what does Fibonacci look like?". I was happy to oblige and provide some code.

As a quick refresher, we want to write a function that takes an integer n, the index into the Fibonacci sequence, and returns the number at that index. For this implementation, we're assuming that the input is a non-negative integer. Okay, let's go!

Recursive

So, the typical recursive solution has a function fib(n) that returns 1 if n is 0 or 1 and otherwise recursively calls itself for the two previous indexes, n-1 and n-2, and adds the results. Note that this version starts the sequence at 1, 1 rather than 0, 1.

def fib(0), do: 1
def fib(1), do: 1
def fib(n), do: fib(n-1) + fib(n-2)

Happy days! We just need three lines to implement this! Now, if you recall my previous post on Elixir function conditionals, that is the same thing we're doing here, except that we match on literal argument values instead of using guards. We return 1 if we pass in an index of 0 or 1; otherwise, we fall through to the last clause and do the recursive logic that we want. That's it!
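For anyone reading along from Python land, here is a rough sketch of the same three clauses folded into a single function. The name fib and the base cases of 1 simply mirror the Elixir version above; this is only for comparison and is not a recommended way to compute large Fibonacci numbers.

```python
def fib(n):
    # Mirrors the first two Elixir clauses: indexes 0 and 1 both return 1
    if n in (0, 1):
        return 1
    # Mirrors the last clause: sum of the two previous values
    return fib(n - 1) + fib(n - 2)


print([fib(i) for i in range(7)])
```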

Iterative

"So what about an iterative solution?" you ask. Yes, that's actually what was discussed next. The typical iterative solution uses a loop with two variables that track the previous and current values. That's fine, but something about it just didn't feel quite Elixir-y. So, I thought about it for a bit and came up with the following solution.

  def iter_fib(0), do: 1
  def iter_fib(1), do: 1
  def iter_fib(index) do
    Enum.reduce(2..index, [1, 1], fn(_i, acc) ->
      # Calculate the Fibonacci value
      fib_val =
        # Take the accumulator
        acc
        # Flatten the list since we're appending fib_val by a new list
        |> List.flatten()
        # Sum those values
        |> Enum.sum()
      [Enum.take(acc, -1), fib_val]
    end)
    # This is now the Fibonacci value for the index and its previous value, so just take the last value
    |> List.last()
  end

So, what I've done here is keep a two-element list that represents the tail end of the Fibonacci sequence. We have the same function clauses for index values of 0 and 1, and otherwise, the bulk of our code goes into our Enum.reduce/3. What we are doing is constantly keeping the list length at 2 so we can easily just sum the values and compute the next Fibonacci value. My first implementation was actually a bit memory hungry because I kept appending to the list and then summing its last two values. Why keep the whole list around if you only need the last two values to compute the next one, right?
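To make the "only keep two values" idea concrete, here is a small sketch of the same reduction in Python, using functools.reduce in place of Enum.reduce/3. The names iter_fib and step are just for illustration, and it keeps the same convention as above where indexes 0 and 1 both return 1.

```python
from functools import reduce


def iter_fib(index):
    if index in (0, 1):
        return 1

    def step(acc, _i):
        # acc is always a pair: (previous value, current value)
        prev, curr = acc
        return (curr, prev + curr)

    # One step for each index from 2 through index, starting from (1, 1)
    _prev, curr = reduce(step, range(2, index + 1), (1, 1))
    return curr


print([iter_fib(i) for i in range(8)])
```

Because the accumulator is a fixed-size pair, memory use stays constant no matter how large the index gets.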

And Bob's Your Uncle

That's it! Plain. Simple. Memory smart. Anyhow, you can check out the full source here in this gist. Cheers!

By Adrian Cruz | Published Aug. 19, 2017, 10:08 p.m. | Permalink | tags: elixir

Elixir Conditionals with Function Signatures

On Learning Elixir

So, I've been learning and doing a bit more Elixir lately. I'm only a couple months in, but Elixir has been the primary language I have been coding in at work. Happy days!

This is not an introductory write-up nor a tutorial, but rather a quick look at how conditionals and guards are done in Elixir. Onward.

Let's Take a Look at FizzBuzz

So, yes, FizzBuzz; the classic programming problem and still a favourite interview question. As a quick refresher, this game is played by counting from one through n, but for multiples of three, we'll have "Fizz", multiples of five, "Buzz", and multiples of both, "FizzBuzz". I think that is the simplest I can word it.

Let's have a look at an example in Python first.

def fizz_buzz(n):
    if 0 == (n % 3) and 0 == (n % 5):
        return "FizzBuzz"
    elif 0 == (n % 3):
        return "Fizz"
    elif 0 == (n % 5):
        return "Buzz"
    else:
        return str(n)

There is nothing crazy going on here; the logic is simple and it does exactly what we need it to. It uses the usual if/else logic that you would expect.

So, now let's look at a solution in Elixir.

def fizz_buzz(n) when 0 === rem(n, 3) and 0 === rem(n, 5) do
  "FizzBuzz"
end
def fizz_buzz(n) when 0 === rem(n, 3), do: "Fizz"
def fizz_buzz(n) when 0 === rem(n, 5), do: "Buzz"
def fizz_buzz(n), do: Integer.to_string(n)

If you've never done any Elixir before, I'm sure you may be a bit confused.

Function Signatures as Conditionals

So, let me first say that, yes, if/else does exist in Elixir. "Then why isn't it used here at all?", you ask. Well, Elixir encourages small functions that are explicit about what they do, and being able to pipe those functions together is very useful. I won't write much about the pipe operator here, but if you don't know much about it, I highly recommend learning about it.

Guards

So what we see in the Elixir FizzBuzz is a fizz_buzz/1 function defined with several clauses. As I'm sure you've studied it a bit by now, yes, that is being done in lieu of if/else. The important piece is the when part of each function head, which is called a guard. Elixir tries the clauses from top to bottom and runs the first one whose guard is true for the n we pass in, which is why the "FizzBuzz" clause has to come first. So now we see that each clause lines up with a branch of the if/elif/else logic we wrote in Python. Pretty neat, eh?
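One way to picture how guarded clauses are chosen is as an ordered list of (guard, body) pairs where the first guard that passes wins. The sketch below models that in Python; CLAUSES and the lambdas are hypothetical stand-ins for illustration and not how Elixir is actually implemented.

```python
# Each entry is (guard, body), in the same order as the Elixir clauses
CLAUSES = [
    (lambda n: n % 3 == 0 and n % 5 == 0, lambda n: "FizzBuzz"),
    (lambda n: n % 3 == 0, lambda n: "Fizz"),
    (lambda n: n % 5 == 0, lambda n: "Buzz"),
    (lambda n: True, lambda n: str(n)),  # no guard: always matches
]


def fizz_buzz(n):
    # Try the clauses top to bottom and run the first whose guard is true
    for guard, body in CLAUSES:
        if guard(n):
            return body(n)


print([fizz_buzz(n) for n in (1, 3, 5, 15)])
```

Moving the first entry to the end of the list would make 15 come back as "Fizz", which is the same reason the order of the Elixir clauses matters.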

By Adrian Cruz | Published March 14, 2017, 7:06 p.m. | Permalink | tags: elixir

Always a Student, Always Learning

Lessons from the martial arts world

There's a saying that jiu-jitsu practitioners repeat now and then: "a black belt is just a white belt that never stopped learning". That rings very true for me and the way I approach life. I don't claim to be an expert in anything. I am a lifelong student and will try my best to continue learning things day by day.

But what does this all mean really? Am I constantly just learning new things and not polishing my expertise in something? Well, no. I mean to say that there will always be something to learn as long as you allow yourself to learn. Going along with this martial arts theme, there is another phrase that gets said: "empty your cup". If you consider yourself to be an expert in anything, do you just sit back and stop learning about whatever field you are an expert in? No. Always look to innovate and always look to improve.

Teaching is also learning

So Katie Cunningham gave this wonderful closing for PyGotham this year. Hopefully the video of it will be up some time. It was truly wonderful. My takeaways from it were more of a reinforcement of what I've already been trying to do: speak more and teach more. I'm quite the introvert, so I tend to shy away from conversation. But I challenge myself to speak to new people and to speak more at conferences and meetups. For an introvert, this is extremely scary. Not only do I try to speak more, I am also trying to be a better teacher. Okay, so not your typical classroom teacher, but a teacher in the sense that I am transferring knowledge to someone else. For instance, at work we want more cross-functional teams and more open collaboration, so any of my work should be easily picked up by any of my teammates. For that to happen, I need to be a good teacher, not one that just says, "here's some code, RTFM now!".

I've noticed that I find myself seeking opportunities to improve my teaching skills. Whether it be a small chit chat with a colleague on what's new in the world of tech or just being a good role model for my kids, I am pushing for a very easy, relaxed conversation that everyone will enjoy with no fear, no pressure.

Phrasing matters! One thing that I take extra care with is the way I converse with others. For instance, publicly shaming someone is never a good idea. "This piece of code is wrong" versus "Can you tell me how this piece of code works? I think something looks odd here" is a good example of the different language choices you can make to sound more amiable.

Student of life

Knowing that there will always be something new to learn is reassuring for me. That is the very reason I enjoy being an engineer; technology is always changing and I am there learning to keep up. But obviously, occupation isn't the only part of your life where you'll constantly learn; growing up is really just a continuous learning process. Constantly learn and tune those dials. You, too, are a student of life, learning all sorts of new things daily.

By Adrian Cruz | Published July 31, 2016, 5:02 a.m. | Permalink | tags: engineering advice, pygotham

Writing Out Files & Python UnicodeEncodeError Woes

A very common headache that I am sure every engineer has had to face at least once in their life is character encoding. Oh, yes, that fun topic! No, I do not have a solution for everyone. Sorry! But in case you are in the Python world like I am, and you are writing out files and getting a bunch of UnicodeEncodeError exceptions, try what I have below.

tl;dr Show me an example!

import codecs
import json

# Python 2: json.dumps() accepts an encoding argument (removed in Python 3)
with codecs.open(filename, 'w', encoding='latin-1') as outfile:
    outfile.write('{}\n'.format(json.dumps(data, encoding='latin-1')))

So, what? And why?

Okay, so I am writing a latin-1 encoded file. Yes, latin-1. Why? Well, I've chosen latin-1 here because latin-1 was giving me issues, so there! But really, if you want to write to a different encoding, obviously just swap that out.

But the long explanation, if you were curious, is that I am reading in data that was latin-1 encoded and it was making my data ingestion jobs fail because I default to utf-8. The json.dumps() bit is actually not needed if you are not working with json (obviously!). But, I wanted to point out that in case you were writing json, you also need to set the encoding to whatever you choose there as well. It is currently on my TODO list to see why that is the case.
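For reference, here is a rough sketch of the same idea on Python 3, where the built-in open() takes an encoding argument and json.dumps() no longer has one. The file name and the sample data are made up for illustration, and ensure_ascii=False is my own choice here so that the non-ASCII character is actually written as latin-1 instead of being escaped.

```python
import json
import os
import tempfile

data = {"city": "Málaga"}  # sample data, made up for this example
path = os.path.join(tempfile.mkdtemp(), "out.json")  # hypothetical file name

# The encoding is set on the file; json.dumps() just returns a str
with open(path, "w", encoding="latin-1") as outfile:
    outfile.write("{}\n".format(json.dumps(data, ensure_ascii=False)))

with open(path, "rb") as infile:
    raw = infile.read()

print(raw)  # 'á' is stored as the single latin-1 byte 0xE1
```

A character that latin-1 cannot represent, such as "€", would raise UnicodeEncodeError when written this way.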

By Adrian Cruz | Published March 21, 2016, 9:50 p.m. | Permalink | tags: python

Intro to Building Out Data Pipelines With Python and Luigi

A very common question that I have been getting asked is, "Luigi? What's that?". My usual answer, in brief, is that it is a project open sourced by Spotify for managing workflows and the dependencies between their jobs. But to quote the Luigi ReadTheDocs page:

Luigi is a Python package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.

Luigi Tasks & Targets

The main pieces of Luigi are built around a Task and a Target. A Task is exactly what you would think it would be: a single task in your data pipeline. So for example, you may have a task that reads in some csv file and pulls out specific values that you want from the file. The Target is the intended output for your Task. So, back to our csv file example, an output() Target for that task may be a cleaned up csv file with the values you wanted.

Here Is Our Example Task

import luigi


class MyTask(luigi.Task):
    def output(self):
        return luigi.LocalTarget('my_output.csv')

    def run(self):
        cat_count = 0
        with open('input.csv', 'r') as in_file:
            for line in in_file:
                # Strip the trailing newline before splitting the columns
                animal, age, color = line.strip().split(',')
                if animal == 'cat':
                    cat_count += 1

        with self.output().open('w') as out_file:
            # write() expects a string, not an int
            out_file.write(str(cat_count))

That is a [dumb] simple Task. This task has no requirements. All that it does is read in a file, parse it and write out its output to another file.

One thing to note is the importance of output(). Luigi checks whether the Target returned by output() exists to decide if this task is complete. That is Luigi's definition of complete. You can also override complete() if you do not have an output, but for now just think that every task needs an output.
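The completeness check described above can be pictured with a tiny stand-in that does not use Luigi at all. ToyTask, output_path and this complete() are hypothetical names for illustration; the only assumption is the rule stated above, that a task counts as done once its output exists.

```python
import os
import tempfile


class ToyTask:
    """Hypothetical stand-in for a task, not Luigi's actual class."""

    def __init__(self, output_path):
        self.output_path = output_path

    def complete(self):
        # Done means: the output file is already there
        return os.path.exists(self.output_path)

    def run(self):
        with open(self.output_path, "w") as out_file:
            out_file.write("3\n")


task = ToyTask(os.path.join(tempfile.mkdtemp(), "my_output.csv"))
before = task.complete()
if not before:
    task.run()  # only runs when the output is missing
after = task.complete()
print(before, after)
```

Running the same check a second time would skip run() entirely, which is what makes re-running a pipeline cheap.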

"So, what is this good for?"

So one big thing that I purposely left out when describing Luigi is that it integrates really well into the whole Hadoop ecosystem! So, now let's take a step back and think about how we would run these batch data processing jobs without Luigi...

Let's say we have several jobs that all need to finish before the task you want can be considered complete. So for example, maybe we need to process some data that we have found in some log files. The log files are currently stored in S3, so we'll have a job to fetch those locally. Maybe the logs need to be cleaned up a little bit, so we will do whatever filtering and other transformations we need to cleanse the data. Next, we'll want the newly formatted logs loaded into HDFS. After that, we can utilize Hive, so we'll want to create a table with those logs.

Data jobs in the past

So in the past, we would just run each of these jobs as its own cron job. But wait, depending on how much data we're working with, these jobs will take varying lengths of time to complete! So, the best you can do is see how long these jobs run and schedule them accordingly. So for example, we know the data is pulled down from S3 within ~20 minutes, so we'll schedule the next job 30 minutes later, and do the same thing for the remaining jobs as well. If a job runs long or fails, the next one starts on schedule anyway.

Now, with Luigi

With Luigi, you create dependency chains for jobs very easily by overriding the requires() method. Say you want your entire process to run in this order: Task0->Task1->Task2->Task3. You can now say that Task3 requires Task2, Task2 requires Task1, and so on. This looks like the following:

class Task3(luigi.Task):
    def requires(self):
        return [Task2()]

    # Other core code would go here as well, like run(), output()...

Now, you have one single point to schedule and no need to guess when each job runs! Pretty neat right? :)
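To see why a single entry point is enough, here is a toy sketch of how a requires() chain can be walked so that dependencies run first. This is not Luigi's scheduler; ToyTask, build and the done set are hypothetical and only illustrate the ordering.

```python
class ToyTask:
    """Hypothetical stand-in for a task, not Luigi's actual class."""

    def requires(self):
        return []

    def run(self, log):
        log.append(type(self).__name__)


class Task0(ToyTask):
    pass


class Task1(ToyTask):
    def requires(self):
        return [Task0()]


class Task2(ToyTask):
    def requires(self):
        return [Task1()]


class Task3(ToyTask):
    def requires(self):
        return [Task2()]


def build(task, log, done):
    # Run every requirement first, then the task itself, once each
    for dep in task.requires():
        build(dep, log, done)
    name = type(task).__name__
    if name not in done:
        task.run(log)
        done.add(name)


order = []
build(Task3(), order, set())
print(order)
```

Asking only for Task3 is enough to pull in the rest of the chain in the right order, so there is nothing left to time by hand.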

This is obviously just an introduction. I've only touched the surface about Luigi. But, if you need to build out data pipelines and enjoy doing so in Python, I highly recommend checking out Luigi! Cheers!

By Adrian Cruz | Published May 31, 2015, 7:57 p.m. | Permalink | tags: big data, hadoop, luigi, python