Side Projects

I tried to work on side projects for quite a while, but not until last year did I finally finish something. Here I summarize what I did; I’ll share more details in subsequent posts.

First, let me reflect on why my previous efforts were fruitless.

The single biggest mistake that doomed my previous attempts was not working at the proper level of abstraction. For example:

  • Once I tried to write an efficient approximate counting program. The idea is simple: persist an instance of the data structure introduced in this paper for different streams, and provide a read/write interface. But I mixed questions from different abstraction levels and spent too much time figuring out how to write a key-value store, instead of just using an existing solution and focusing on implementing the core idea.
  • Another time, I tried to write “some webapp” using React. But without a clear idea of the end product in mind, I ended up implementing low-level components (like a loading indicator) and stopped halfway.
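For the approximate counting case, the separation of levels might have looked like the sketch below. It is only an illustration: a Morris counter stands in for the data structure from the paper, the names `ApproxCounters`, `write`, and `read` are hypothetical, and a plain dict plays the role of the key-value store.

```python
import random


class ApproxCounters:
    """Read/write interface over one approximate counter per stream.

    `store` is any dict-like object mapping a stream name to an integer
    exponent, so persistence is delegated to an existing key-value store.
    """

    def __init__(self, store, rng=None):
        self.store = store
        self.rng = rng or random.Random()

    def write(self, stream):
        """Record one event for the given stream."""
        c = self.store.get(stream, 0)
        # Morris counter: bump the exponent with probability 2**-c.
        if self.rng.random() < 2.0 ** -c:
            self.store[stream] = c + 1

    def read(self, stream):
        """Return the estimated number of events seen for the stream."""
        return 2 ** self.store.get(stream, 0) - 1


# A plain dict is enough to exercise the core idea.
counters = ApproxCounters(store={})
for _ in range(1000):
    counters.write("clicks")
estimate = counters.read("clicks")  # a rough, randomized estimate
```

Because the class only needs `get` and item assignment, the dict could later be swapped for something persistent (for example a `shelve` file from the standard library) without touching the counting logic.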

Now looking back, the first thing to do for a side project is to define the problem. I’ll roughly categorize the problem space as:

  1. Build some actual functionality. In this case, reuse whatever is readily available, and focus on the business logic.
  2. Duplicate and validate some interesting idea. In this case, determine what the essence of that idea is, and for anything at a lower level, either reuse existing solutions or just plug in a naive implementation.

I’ll start with an example of 1. When I started learning Korean last year, I wanted flashcard functionality not only for words but also for grammar structures and sentences. Since there didn’t seem to be anything like that out there, I decided to take it on as a side project. I then settled on the tech stack (Django + React, on an EC2 host) and followed the official tutorials to make a quick end-to-end prototype. For the frontend specifically, I spent more time figuring out how to reuse React components available on npm than reinventing the wheel, which turned out to be more productive and helped me keep the momentum.

As for 2, I had long wanted to implement some ideas used internally at the company, but the complexity always scared me off until I tried extracting the essence first.

One example is Flow (a later version of it). The essence is to use a decorator to make functions “async”, i.e., they can take either plain arguments or Futures, and they return a Future. To make this work, I only needed to make Future serializable and to distinguish between DAG construction mode and actual execution mode. For the other components, I just used a naive FIFO queue (without a scheduler at all) to hold ready tasks, and executors simply pull from the queue.
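To make that essence concrete, here is a minimal single-process sketch. It is not the actual implementation: the names `task`, `Future`, `ready`, `waiting`, and `run` are made up for illustration, serialization of Future is left out, and one loop plays the role of the executors.

```python
from collections import deque
from functools import wraps


class Future:
    """Placeholder for the result of a task that has not run yet."""

    def __init__(self, func, args):
        self.func = func
        self.args = args
        self.done = False
        self.value = None


ready = deque()  # naive FIFO queue of tasks whose inputs are all available
waiting = []     # tasks still blocked on an upstream Future


def _is_ready(fut):
    return all(a.done for a in fut.args if isinstance(a, Future))


def task(func):
    """Decorator: calling the function only records a node in the DAG."""

    @wraps(func)
    def wrapper(*args):
        fut = Future(func, args)
        (ready if _is_ready(fut) else waiting).append(fut)
        return fut

    return wrapper


def run():
    """Execution mode: a single executor pulling from the FIFO queue."""
    while ready:
        fut = ready.popleft()
        resolved = [a.value if isinstance(a, Future) else a for a in fut.args]
        fut.value = fut.func(*resolved)
        fut.done = True
        # Promote any waiting task whose inputs are now all done.
        for w in list(waiting):
            if _is_ready(w):
                waiting.remove(w)
                ready.append(w)


@task
def add(x, y):
    return x + y


@task
def double(x):
    return 2 * x


total = add(1, 2)        # construction mode: nothing is computed yet
result = double(total)   # a Future is accepted in place of a plain argument
run()                    # execution mode
print(result.value)      # 6
```

Calling a decorated function is the DAG construction mode, and `run()` is the execution mode. In a distributed setting, the Future would additionally need to be serializable so that tasks can travel between processes.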

A slightly different example is duplicating EventBase. Here, getting a sense of how low-level libraries (especially libevent) work was exactly the goal, but again, having that definition of the “problem” kept me from getting lost in endless details. I also implemented a simple benchmarker on top of my EventBase implementation, which benchmarks an RPC server that is itself built on that EventBase.
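As a rough picture of what such an event base boils down to, the sketch below builds a tiny loop on Python's standard `selectors` module. It is neither libevent's API nor my implementation: the `EventLoop` class and its method names are hypothetical, and only read events are handled (no timers or signals).

```python
import selectors
import socket


class EventLoop:
    """Minimal event loop: wait for readable sockets, dispatch callbacks."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.running = False

    def add_reader(self, sock, callback):
        # The callback is stored as the registration's data payload.
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def remove_reader(self, sock):
        self.selector.unregister(sock)

    def stop(self):
        self.running = False

    def run(self):
        self.running = True
        while self.running:
            for key, _events in self.selector.select(timeout=1.0):
                key.data(key.fileobj)  # invoke the registered callback
        self.selector.close()


received = []
loop = EventLoop()
a, b = socket.socketpair()


def on_readable(sock):
    received.append(sock.recv(1024))
    loop.remove_reader(sock)
    loop.stop()


loop.add_reader(b, on_readable)
a.sendall(b"ping")
loop.run()
a.close()
b.close()
```

An RPC server on top of this would register its listening socket the same way and add a reader for each accepted connection.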

I’ll stop here for now, and use subsequent posts to further describe each project.
