Sunday, September 27, 2015

My Summer in Sunny San Diego

Let me preface this post with a warning: it's gonna get technical. It's just not possible to talk about an entire summer's worth of software development without getting too technical for the average reader, so I apologize in advance. (Google is your friend!)
ViaSat's very military logo

So I had an amazing experience doing software development for ViaSat Inc. this past summer, and I want to share some of the awesome technologies I got to work with! I was working with a team of two other interns, one of whom also goes to Harvey Mudd. Our initial task was to develop the equivalent of Netflix's "Chaos Monkey," but for ViaSat's core switch infrastructure. We had to switch gears completely when we got hit with the ridiculous policy that no interns could be signed onto the core switch's NDA because of the 'likelihood we may work for a competitor in the future.' ... Wait, isn't that the entire purpose of an NDA? Hmm... So we came up with a solution: instead of writing a program like Chaos Monkey that would run various tests on the development infrastructure, we would write a highly extensible framework that lets the developers write tests to be run on the development infrastructure. Write code to run other people's code!

Even though our users were other developers, we wanted to make things as easy as possible for them to increase the likelihood of our project actually being useful. So we decided to allow the users to run ANYTHING! ... (that runs on Linux). Figuring out how to make this work involved learning a slew of shiny new system administration and cloud tools including Docker, Ansible, OpenStack, and did I mention Docker?

In the end, our final product was split amongst three repositories.

1. The Deployment Repository

This repository mainly consisted of Ansible playbooks and a Dockerfile. Ansible playbooks are similar to bash scripts in what they can accomplish, but they are written declaratively, with a much simpler syntax. The playbooks would boot up a configurable number of slave and master VMs, configure those VMs according to their roles, and finally pull and deploy our web interface and scheduler repositories. At the end of a deploy, which could take up to 30 minutes, our entire infrastructure would be up and fully functioning. Ansible needs a few dependencies to be able to interface with OpenStack, so I wrote a Dockerfile with all the dependencies baked in that would automatically run the deployment process. This way, users would only need Docker installed to deploy our VM cluster.
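For illustration, a deployment image along these lines might look like the sketch below. Everything here is hypothetical: the base image, package names, and playbook filename are my illustrative guesses, not the actual repository contents.

```dockerfile
# Hypothetical sketch of a deployment image: Ansible and its
# OpenStack client dependencies baked in, so users only need
# Docker installed to kick off a deploy.
FROM ubuntu:14.04

RUN apt-get update && apt-get install -y python-pip && \
    pip install ansible python-novaclient

COPY . /deploy
WORKDIR /deploy

# Running the container starts the whole deployment.
ENTRYPOINT ["ansible-playbook", "site.yml"]
```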

2. The Scheduler Repository

The view from my office (just imagine a beautiful sunny day)
The view from my office (just imagine a beautiful sunny day)
This repository was deployed in the form of another job run on the infrastructure. Its task was to communicate with the web app and the database server, schedule the jobs to run randomly based on each job's configuration, and then go to sleep until it was needed next. In order to wake the scheduler up early, I had to write a signal handler, and a signal sender that would ssh into the container the scheduler job was running in and send the Unix alarm signal (SIGALRM) to the running process. It was kind of hacky, but also very cool to figure out!
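The wake-up mechanism can be sketched in a few lines of Python. This is a minimal sketch of the idea, not the actual scheduler code: the scheduler installs a SIGALRM handler that breaks it out of a long sleep whenever the signal arrives.

```python
import signal
import time

class WakeUp(Exception):
    """Raised by the signal handler to break out of a long sleep."""

def _on_alarm(signum, frame):
    # Raising here interrupts the blocking time.sleep() below.
    raise WakeUp

signal.signal(signal.SIGALRM, _on_alarm)

def sleep_until_needed(seconds):
    """Sleep up to `seconds`, returning early if SIGALRM arrives."""
    start = time.time()
    try:
        time.sleep(seconds)
    except WakeUp:
        pass  # woken early; go check for work
    return time.time() - start

# A sender (e.g. after ssh-ing into the container) would run
# something like:  kill -ALRM <scheduler_pid>
```

Note that the handler has to raise an exception: since Python 3.5, a plain `time.sleep()` is automatically restarted after a signal handler returns, so a no-op handler would not wake the scheduler at all.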

3. The Web Interface Repository

This repository contained mostly Python code, HTML, and JavaScript. Our web framework was Flask, and for pretty JavaScript forms and buttons we used Foundation. I'm most proud of the Docker backend I wrote to let the web app use the Docker API and start containers for the ssh configuration feature. I used docker-py for the API integration, and tutumcloud's images as the ssh-serving Docker images. This was the most challenging part of the project. Some functions required communicating between various VMs, generating and serving secure ssh keys, and carefully testing for and catching errors. I needed to discern between server errors and errors in users' Dockerfiles, which the system would need to handle differently. In the end, I am incredibly happy with how it turned out.
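The server-error-versus-user-error distinction can be sketched like this. All the names here are hypothetical (the real backend used docker-py's own exception types), but the idea is the same: classify a failure as the user's fault or ours before deciding what to show them.

```python
# Hypothetical sketch: UserError, ServerError, and
# classify_build_failure are illustrative names, not the real API.

class UserError(Exception):
    """The user's Dockerfile is at fault; show them the build log."""

class ServerError(Exception):
    """Our infrastructure is at fault; log it and alert us instead."""

def classify_build_failure(exit_code, log_text):
    """Decide who broke a build from its exit code and log output."""
    if exit_code == 0:
        return None  # build succeeded, nothing to classify
    if "Cannot connect to the Docker daemon" in log_text:
        raise ServerError(log_text)  # our daemon, our problem
    raise UserError(log_text)        # otherwise blame the Dockerfile
```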

In hindsight, I would say that our team did a good job of avoiding hitting too many major roadblocks. We managed to be quite productive all summer. I had an incredibly fun experience at ViaSat and I'm looking forward to doing more software development in the future.

Thanks for reading!


Friday, March 13, 2015

Voxel Engines

First off, a voxel engine is a game engine that uses voxels, and voxels are simply 3D pixels. One of the most popular games that runs on a voxel engine is Minecraft.

A landscape from Minecraft
If you don't know (how could you not know?!), Minecraft uses massive blocks (voxels) to form a highly sandboxed universe where players work together to build their own virtual worlds. To me, Minecraft seems to be just a demonstration of the incredible potential behind voxel-based sandbox games. I was enthralled by the game throughout high school. It was another world, one where I was completely free to build whatever, wherever, and with whomever. I met people from all over the world and worked with them to create complicated logistics systems and redstone circuitry, as well as beautiful mansions and cabins. All this, with voxels that are actually half the height of the characters!

Nowadays, voxel engines are getting much more complex and refined. These engines can achieve much more complicated worlds with better-optimized algorithms written in higher-performing languages than Java, the language Minecraft is written in. I believe that if voxel engines continue to get more powerful, while remaining entirely customizable and modular, they could soon become the future of virtual reality and of video gaming in general.
A landscape from Voxel Farm

I recently stumbled upon an incredibly impressive voxel engine called Voxel Farm. Voxel Farm is written in C and C++ and is a highly advanced engine that allows relatively intuitive world building in a much more continuous sandbox world than Minecraft's. As you can see from the image on the left, Voxel Farm looks just like a typical gaming engine, and in many ways it is one: for example, it can output meshes in real time and work in tandem with many popular game engines like Unity or CryEngine. Under the hood, however, it is still a fully voxelized world.

Voxel Farm is being developed as a game engine and not as a game. While this is an obvious point, it makes a huge difference. The developers are spending much of their programming time targeting other developers, which will have some interesting effects. Now, you might ask: why couldn't someone simply use this engine to build the next Minecraft? While I agree that this is a distinct possibility, it simply doesn't offer the potential you get when you are in control of absolutely everything about a game. That freedom cannot be matched, and it is one of the main reasons Minecraft was such a success. It is also worth mentioning that the Pro package distributes the full source code to the buyer, so it is technically possible to change the inner workings of the engine, though I have doubts that even this will be enough.

Despite my doubts, I really hope we get to see some great games come out of this project. And either way, I think the work that is being put into Voxel Farm is an excellent investment. It is pushing the limits of what voxels are capable of, and because of this, I believe it has some incredible potential! So look out for the future of Voxel Farm because it could be making some waves in the world of gaming in the next couple years.

Wednesday, March 4, 2015

More Cellular Automata

I'm back after a little more development on my CA program (see my previous post entitled 'Cellular Automata') and a very educational conversation with reddit user /u/slackermanz, someone with much more experience making cellular automata than me. (Seriously, look at /r/cellular_automata; they have some very impressive patterns on there, and most of them are made by /u/slackermanz.)

The modifications I have made to my program are such that each cell no longer moves, but can eat cells in its near vicinity and reproduce in a random direction. As mentioned previously, I implemented an energy value that each cell starts out with; if a cell runs out of energy, it becomes food for other cells, and if it has an excess of energy, it reproduces. This produces the following pattern.

As you can see, there is an exponential expansion, and I am working to minimize this, since I think there will be more interesting patterns in an energy-starved environment.

Another problem I am working to fix is the overlapping of cells. Somewhere in my code, multiple cells are able to occupy the same space, which is causing an over-creation of cells and lagging my program. To make this more efficient, I will initially spawn cells for all spaces, then have my program set each cell to a state of dead, alive, or food; that way the number of cells stays constant.
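The fixed-grid idea can be sketched as follows. This is a toy sketch of the plan, not the code in my repository: every position holds exactly one cell from the start, and cells only ever change state, so the population can never overlap or grow.

```python
import random

# Toy sketch: a fixed-size grid of cells in one of three states.
DEAD, ALIVE, FOOD = 0, 1, 2

def make_grid(width, height, alive_fraction=0.1, seed=0):
    """Spawn a cell for every space up front; only states change later."""
    rng = random.Random(seed)
    return [[ALIVE if rng.random() < alive_fraction else DEAD
             for _ in range(width)] for _ in range(height)]

def update(grid, energy):
    """One step: each live cell burns energy; starved cells become food."""
    for y, row in enumerate(grid):
        for x, state in enumerate(row):
            if state == ALIVE:
                energy[(y, x)] = energy.get((y, x), 3) - 1
                if energy[(y, x)] <= 0:
                    row[x] = FOOD  # starved: become food for neighbors
    return grid
```

Because `update` only rewrites states in place, the cell count is constant by construction, which sidesteps the overlap bug entirely.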

As always, my code is on my GitHub. Stay tuned for these modifications and optimizations in the near future. Until then, thanks for reading.

Monday, March 2, 2015

What I did over winter break, and why it was a massive waste of time.

I want to talk about the myriad obstacles I encountered over winter break (2014 - 2015). Ever since the end of my senior year of high school, after writing a pretty dinky little 2D physics simulator, I had really wanted to give it another dimension, but was held back by the complexity. 3D graphics and physics get quite complex, so to simplify the enormous task there is a host of tools one can use, ranging from closed-source production-quality game engines like CryEngine or Unity to the free and open-source tools I chose: Ogre3D and Open Dynamics Engine. I chose the open-source libraries because (a) I exclusively use Linux, and Unity isn't available for Linux, and (b) I am a Linux sysadmin--what do you expect?

An image of what some of the starter code rendered
This Ogre is a pain in my ass!
So at the beginning of my sophomore year, I decided upon Ogre3D after learning that Cinder was unfortunately Windows and OS X only. I tried to start working through the beginner tutorials, but the provided starter code wouldn't even compile. The starter code's Makefile was just broken, with about four lines of options needing to be added to the autotools makefile.am and configure.ac. After getting in touch with a developer, who kept insisting I use Code::Blocks, a graphical IDE (ew, gooey), I noticed some very helpful comments on one of the autotools configuration pages that included exactly what I needed to get the starter code compiling. My question is: why hadn't these changes been made to the code? Finding this and getting it to work took up most of the first half of my semester--just to get the starter code compiling!

So I went into winter break hoping that my experience would smooth out, or that I would get over some threshold and finally understand how to make Ogre do exactly what I want in an intuitive way. After all, it is object based; why shouldn't it be intuitive? Well, long story short: I was wrong. Ogre's syntax is incredibly verbose and complex. Its objects are not used in an intuitive way--its data structures are sort of hybrids between self-contained objects and nested-scope static methods. For example, I struggled to simply print to the program log, because printing requires the following line of code in Ogre:

    Ogre::LogManager::getSingletonPtr()->logMessage("### Error parsing config file ###");

"Ogre::LogManager::getSingletonPtr()" is a call to a static method within the LogManager scope, which is itself within the Ogre scope. That method returns a pointer to the Ogre::LogManager object, which can only be initialized once in the program. You can then use that pointer to call logMessage(), which adds your message to the log. Here is the initialization of the LogManager object:

    Ogre::LogManager* logMgr = new Ogre::LogManager();
    Ogre::Log* myLog = Ogre::LogManager::getSingleton().createLog("myLog.log", true, true, false);

Honestly, I don't even have a complete understanding of what this code is really doing. I mean, is logMgr a dereferenced LogManager pointer to a pointer to the new LogManager object? What? After that line, we can completely ignore the existence of the logMgr object, because of course we will just use getSingleton() whenever we need to log. (What was the point of even making us write that line, then?) But initializing the LogManager wasn't enough, because we need to create the log file and put it into myLog, which we will never use again. If the LogManager object is only meant to be initialized once and used through getSingleton(), then why make us be so verbose about it? These kinds of options should not need to be written explicitly. If I wanted to rename my log's filename, or specify whether to add debugger output to it, then I should be able to via these complicated calls. However, if I just want to print some message and don't really care, I am forced to add these highly complicated lines of code to my file just to do that.

My entire winter break went like this. There was no way I had the time to fully understand all of the highly complicated syntax of Ogre, while also trying to tie in Open Dynamics Engine.

Open Dynamics Engine (ODE) is a rigid-body physics engine written in C. Naturally, its interface was incredibly different from Ogre's. While it was much less complicated and somewhat more straightforward, I ran into serious difficulties when trying to get it to play nice with the big bad Ogre. For example, all the calculations ODE ran would be returned to my program as the ODE-defined 3D coordinate type const dReal*, and I had to decompose that object and turn it into what Ogre wants, Ogre::Vector3s, in order for the object to be re-rendered in the correct location. It was a constant battle trying to make the two get along, and even today they still don't behave.

Well, that got long and complex... In summary, I dove head first into the deep end of two very complicated and very different C and C++ libraries, and I emerged with a buggy hunk of code, of which I understand <60% and have written <40% myself. I previously thought that if I forced myself to use C++, then I would be a better programmer than all those cheaters using easy languages like Python. I even started the break off with a few OpenGL tutorials before realizing that it was far too low level for the kinds of things I wanted to do. Clearly, the same was true for Ogre and ODE. I learned that C and C++ are production-level languages, and that even mildly complicated programs are usually written by many people over the course of many months--not by a hobbyist experimenting with 3D programming with a low work ethic (winter break is a break, after all).

All of this is ironic, of course, because this blog exists because of my research at ASU. There, I researched the differences between Java and C++ and concluded that C++ is more geared towards businesses with more value in their software and many more programmer (wo)man-hours to devote to it--NOT lazy CS majors trying to have some fun making a realistic 3D world. I guess this whole experience was a lesson in believing the results of actual research...

Well, thanks for reading, and if you have any questions feel free to comment and I will try to answer them if I can. 

-Jeff

Saturday, February 28, 2015

Cellular Automata


What are cellular automata? To define them, I will quote directly from Wolfram:
"A cellular automaton is a collection of "colored" cells on a grid of specified shape that evolves through a number of discrete time steps according to a set of rules based on the states of neighboring cells. The rules are then applied iteratively for as many time steps as desired."
It is pretty much the study of Conway's Game of Life. And there's a lot more to study than you'd expect. There are repeating patterns which return to their original states after a certain number of steps, and patterns in stable states that just don't move at all. The most interesting of all are the stable patterns that move, called spaceships or gliders; they will work their way across the grid until they collide with something or just go off screen. Here are some examples.
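The rules fit in a few lines of Python. Here's a minimal sketch using a set of live coordinates: a live cell survives with two or three live neighbors, and a dead cell with exactly three comes alive. Run it on the classic glider and the whole pattern walks diagonally, reappearing one cell over every four steps.

```python
from itertools import product

def step(live):
    """One Game of Life generation; `live` is a set of (x, y) cells."""
    counts = {}
    for (x, y) in live:
        # Each live cell adds 1 to the neighbor count of its 8 neighbors.
        for dx, dy in product((-1, 0, 1), repeat=2):
            if (dx, dy) != (0, 0):
                counts[(x + dx, y + dy)] = counts.get((x + dx, y + dy), 0) + 1
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# The classic glider: after four steps it reappears shifted by (1, 1).
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
```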

A poor lost and lonely cell
So I decided to make my own implementation, with a twist. I would like to make the cells just slightly smarter than your average Conway's Game of Life cells. My rules are much more complicated than the typical rules in a cellular automaton, but I believe it will still be possible to study some interesting interactions among the cells.

So, to each cell I have added the properties of hunger and energy. I plan to have a cell that is out of energy and food become food itself, and to have cells that are fully energized and full of food make new cells. Perhaps eventually reproduction will require other cells around, but for now they will reproduce asexually. You can see the code on my GitHub.

At the moment my project is sort of halfway there, and needs another day or two of love and attention. I will try and get some work done this week and perhaps post my progress here when that happens. Thank you for reading, and feel free to comment if you have questions or comments.

Revival of my blog

Even though I originally created this blog to update my professors and advisers on my senior research project with Dr. Rida Bazzi at ASU, I have decided that the URL jeffmilling.blogspot.com is a perfect URL to use as a personal blog. Why let it go to waste?

I will be posting interesting things that I learn about, and small projects that I am working on. I can't guarantee regular posts, but I do feel like maintaining a blog is a good way to document stuff that I am working on and things that inspire me. These will likely be various things from my ridiculous range of interests. To list a few of the topics I would like to post about: quantum information, quantum field theory, voxel programming, virtual reality, artificial intelligence, cellular automata, graphics programming, and last but not least, Linux. So thanks for reading, and I hope you learn a thing or two.

Friday, April 19, 2013

Multi-Core Computing

Most modern Central Processing Units (CPUs) are multi-core, and are advertised as such. For example, Intel's processors are usually advertised as "Intel i3 Dual-Core Processor" or "Intel i5 Quad-Core Processor," and more recently there is even the "Intel i7 Hexa-Core Processor." These processors can get a bit extreme, but what does it really mean, let's say, to run a hexa-core processor? Well, on the right you can see a hand-drawn example of a single core of a CPU. This core can execute a certain number of operations per clock cycle (which used to be 1, but has since increased). An operation can take the form of an arithmetic or logical instruction (like ADD, AND, or XOR) on two binary values of a certain size, with the result reported to wherever it needs to be. There are many different forms of CPU operations, but most of the actual computing is spent doing the previous example. Processing is just a bunch of math. Adding a second core to a CPU, as you should expect, can theoretically double your operations-per-clock-cycle value. Adding four can quadruple that value, adding six can sextuple it, and so on and so forth. Within the last year, an exciting new engineering startup began work on the Parallella, a board built around a 64-core processor whose cores are connected in a square mesh. The future of multi-core computing is definitely an exciting one.

So at this point, it seems like the more cores, the better. Right? Not exactly, as there are many complications when it comes to programming for multi-core architectures. As an example, take a simple Fibonacci sequence calculation. Inside the main loop, the current step, which adds the two previous numbers, relies on those two numbers already having been calculated, and so on and so forth down the chain. This greatly limits the amount of multi-tasking that is possible. So computer programs need to specify when multi-core, or "threaded," operation is allowed.
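The dependency is easy to see in code. Here's a minimal sketch: each loop iteration needs the result of the one before it, so no amount of extra cores lets the loop body run in parallel.

```python
def fib(n):
    """Iterative Fibonacci: inherently sequential, since each step
    depends on the two results computed just before it."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b  # can't start until the previous step finished
    return a
```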

Below are two graphs showing the per-core CPU usage (as a percentage) on ASU's computers. The graph on the top was recorded during testing of C++ programs, and the one on the bottom during testing of Java programs.

CPU usage per core during testing of C++ Applications


CPU usage per core during testing of Java Applications
So, clearly, there are stark differences between these two pictures. What exactly is happening that causes these differences?

Well, the difference lies in the languages. C++ requires the programmer to explicitly say when operations can be threaded, and when no specific allowances are written, no multi-core optimization takes place. Java, on the contrary, allows the programmer to define specific threaded operations, but does not require that in order to use multiple cores: Java's VM will run on multiple cores and distribute operations as they are compiled and executed. So, in conclusion, while C++'s graphs look clean and tidy and Java's look like a mess, Java is actually optimizing its code and taking more advantage of the CPU's architecture.

So now I need to mention the program samples I ran. The CSE students wrote the programs to perform correctly, not efficiently. Maybe some over-achievers might optimize their code, but most students would just try to make the program do what it is supposed to, and when that works they turn it in. If this code were written by an actual software company, like Microsoft or Apple, they would almost certainly invest time in optimizing their code with multi-core architectures in mind. Students, on the other hand, would not.
This brings to mind some pros and some cons. On one hand, professional companies have slightly more control over threaded optimization using C++; on the other hand, Java's automatic optimizations make programming much more convenient and can significantly cut down the runtimes of programs written by a single programmer who otherwise wouldn't have had the time to write lines and lines of code for threaded optimizations.

Of course, this topic will be more deeply analyzed and explained in my upcoming presentation.

Thanks for reading.
- Jeff