A Twisted-Python Developer’s Experience With Node.JS

The Story

Last weekend I was flown out to San Francisco to compete in the University Hacker Olympics hosted by SignalFire. Around 100 university students split into teams of 2-5 and worked with engineers from various companies on projects of their choosing. After 26 hours we presented our products to a panel of judges, who selected the winning team. While there was a prize, I was just there to have a good time and meet some awesome people. Needless to say, I had a great time.

My Team

I got to work with some splendid folk from the Samsung Accelerator. Our project idea was a website and Android app to track ping pong matches between office employees, with a leaderboard for each office. As Jason had given a talk on Node.JS earlier, we decided to build the backend with Node as a learning exercise. It’d be a simple REST API for the app, with data storage handled by MongoDB and live updating of the website through Socket.IO. All very simple and straightforward.

The Good

Node has an amazing community and an easy-to-use package manager. We quickly got started with Express, which made our REST API easy to build and our static content even easier to serve. Mongoose made our database persistence equally easy, and Socket.IO was… well, actually not great. Regardless, if we needed something done, there was a package to help us get there.
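
For a sense of how little code that takes, here’s a minimal sketch of that kind of setup, assuming Express 3-era and Mongoose APIs; the route, schema, database name, and port are made up for illustration and aren’t from our actual project:

var express = require("express");
var mongoose = require("mongoose");

mongoose.connect("mongodb://localhost/pingpong"); // hypothetical database name

var app = express();
app.use(express.static("public")); // serve the static site

// Hypothetical model for a recorded match
var Match = mongoose.model("Match", new mongoose.Schema({
  winner: String,
  loser: String,
  office: String
}));

// Hypothetical REST endpoint: list the matches for an office
app.get("/api/matches", function(req, res) {
  Match.find({ office: req.query.office }, function(err, matches) {
    if (err) return res.send(500);
    res.json(matches);
  });
});

app.listen(3000);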

The Bad

Node, like Twisted, is an event-based asynchronous framework. To achieve this it uses callbacks, like so:

doSomething(function(error, result) {
  if (error) {
    // Handle the error
  } else {
    // Handle the result
  }
});

This works fine when you’re one level deep, but what if your callback has to call another asynchronous function that takes a callback? What if that one does too? Your code starts flowing to the right faster than it flows down, and you enter callback hell.
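
To make that concrete, here’s a contrived sketch of three nested asynchronous calls; every function name in it is made up purely for illustration:

// Hypothetical async functions, each taking a Node-style (error, result) callback
getUser(id, function(err, user) {
  if (err) return handleError(err);
  getMatches(user, function(err, matches) {
    if (err) return handleError(err);
    saveLeaderboard(matches, function(err, board) {
      if (err) return handleError(err);
      render(board); // three levels deep and still drifting rightward
    });
  });
});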

To prevent this, there are a number of libraries that transform the callback syntax into a chainable sequence. Jason, our Node guru, recommended Q. Q introduces promises which, like Twisted’s deferreds, allow easy chaining of callbacks and error handlers. For example:

doSomethingAndReturnAPromise()
.then(function(result) {
  // Do something with the result and return another promise
})
.then(function(result) {
  // Do something with the result of the last callback
})
.catch(function(error) {
  // Uh-oh, one of our three functions broke!
});

While they aren’t exactly like deferreds, they serve a similar purpose and work rather well. Problem solved, right?

The Ugly

Not quite. While Q provides promises, no library returns them natively, which means that every one of those lovely libraries provided by the caring community has to be wrapped in promises manually if you want to avoid callback hell. This negates a substantial portion of the benefit of having the libraries in the first place.
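
The wrapping itself isn’t hard, just tedious. Here’s a minimal sketch using Q’s Node-adapter helpers (Q.denodeify and Q.nfcall); getPlayer and savePlayer stand in for any callback-style functions with the usual (error, result) signature:

var Q = require("q");

// Wrap a callback-style function once so it returns a promise...
var getPlayerAsync = Q.denodeify(getPlayer);

getPlayerAsync("jason")
  .then(function(player) {
    // ...or wrap a single call inline
    return Q.nfcall(savePlayer, player);
  })
  .catch(function(error) {
    // Handle errors from either step
  });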

Conclusion

I’m really glad I actually tried Node. I’d been fairly set in my ways as a user of Twisted and saw no need to switch. Now that I’ve tried it, I realize that many of my arguments were just plain wrong. For instance, developing in javascript isn’t as terrible as I thought it would be, since the most frustrating bits are actually messing with the DOM, and there are good libraries that make it easier to write good javascript. The maturity of the event loop is also a concern, but it’s rather abstract and not something I’m qualified to argue.

What I can say for certain is that the lack of deferreds as a core component of Node has led to a critical flow-control problem. For now, I’ll be sticking with Twisted and its beautiful inlineCallbacks.

Automatic Resume PDF Generation

Earlier I posted about using Prose.IO and Travis-CI to allow posting on my blog from anywhere with little hassle. However, I also host my resume on this site and wanted to be able to easily update it as well. While I could edit it like any other post from Prose, I also wanted a PDF copy for companies seeking a downloadable version and for uploading to other sites.

The question became “How do I automatically convert my markdown resume into a PDF?” A few Google searches later, I stumbled on Alan Shaw’s Markdown-PDF package. This package converts markdown to HTML, renders it with PhantomJS, then exports the rendered page to a PDF. Simple, a bit crude, but it works. To aid in making the documents look good, the package includes the base CSS of Twitter Bootstrap.

To make the generated resume look professional, I utilized some of the options Markdown-PDF provides. The first was using the “Preprocess Markdown” option to remove the YAML header and the PDF download link from the file. Markdown-PDF doesn’t understand the YAML header that Jekyll requires, so I had to strip it out myself, and there was no point in having a PDF download link inside the PDF itself. Next, I processed the generated HTML to turn all links into plain text. They just made no sense if you couldn’t click them, and the styling was distracting. Finally, I used the web font “Oxygen” to add some style.
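
As a rough sketch of those two preprocessing steps (this is not my actual generate_resume.js, and the “Download PDF” link text is a hypothetical placeholder), the transforms boil down to a couple of string operations handed to Markdown-PDF’s preprocess hooks:

// Strip the Jekyll YAML front matter (the block between the leading "---"
// lines) and drop the PDF download link line.
function preprocessMarkdown(md) {
  return md
    .replace(/^---[\s\S]*?---\s*/, "")
    .split("\n")
    .filter(function(line) {
      return line.indexOf("Download PDF") === -1; // hypothetical link text
    })
    .join("\n");
}

// Replace anchor tags in the rendered HTML with their visible text, since
// links aren't clickable in the PDF anyway.
function preprocessHtml(html) {
  return html.replace(/<a\b[^>]*>([\s\S]*?)<\/a>/g, "$1");
}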

Getting Travis-CI to generate the PDF is rather simple. Every Travis-CI VM includes Node.JS by default, so I just had to add npm install markdown-pdf and node generate_resume.js to the before_script section of my .travis.yml. My resume is generated and put in the proper place, and then Octopress deploys it as normal. Simple!

generate_resume.js
pdf.css
.travis.yml

Using Prose.IO to Post From the Cloud

When I first chose Octopress, I was delighted that I could quickly and easily create posts in markdown. I wanted to ensure that it would be as easy as possible to create content, so that I’d have no reason not to post whatever I felt like. However, the biggest issue I had was that it required running jekyll locally to do so.

Today I decided to fix this stumbling block and began to look for a frontend for Octopress. What I found instead was Prose.IO. Prose isn’t a frontend for Octopress; it’s a way to easily edit any file in any of your github repositories. It just so happens that this is all it takes to let me create posts straight from my browser without any fussing with jekyll or git. I create a post on Prose.IO, which commits to github, which is then processed by Travis-CI, which commits the built site back to github, which deploys it. It all just works.

If this sounds like something you’d want for yourself, you can check out this handy guide by Rogerz Zhang. It got me squared away in about 20 minutes. Just note that the format of the Prose.IO configuration in _config.yml has changed. My settings look like the following:

#prose.io settings 
prose: 
  rooturl: "source" 
  metadata: 
    source/_posts:
      - name: "layout"
        field:
          element: "hidden"
          value: "post"
      - name: "title"
        field:
          element: "text"
          label: "Title"
          value: "Post"
      - name: "categories"
        field:
          element: "text"
          label: "Categories"
          value: ""

On Click-fueled Javascript Games

alternatively: ClickQuest and Cookie Clicker, a tale of two games that start with a single click

Before I start, I recommend you at least glance at the games for a minute or two.
http://www.clickquest.net/
http://orteil.dashnet.org/experiments/cookie/

I made ClickQuest almost exactly 3 years ago, far past the time required for me to regret every line of code I wrote. However, I did learn quite a few things during its bug-riddled construction, and many more in the years of reflection afterwards. ClickQuest was originally marketed as “distilling an MMORPG into its most core mechanic – click and numbers go up”. Cookie Clicker takes this a step further: click to increase a number that you can then decrease to make it increase faster. It is the most instant of gratification; a split second is all it takes.

However, while this makes the games extremely good for single-player, it makes multiplayer incredibly hard. If you play by yourself, there is no reason to cheat, as all you do is lower your own enjoyment. When multiplayer (more specifically, a leaderboard) is introduced, players are motivated to cheat. In MMORPGs the layers of abstraction provide a mechanism to defeat cheating. But in clicking games we’ve peeled away those layers, so our only action in the game is clicking. This is the biggest strength and greatest weakness of clicking games.

When I made ClickQuest, it was designed with a leaderboard and chat in mind. I knew that cheating would be a big concern, so I tried to proactively thwart it. Here’s what I tried, in chronological order.

Approach 1: Send every click to the server

This worked great in testing. If I saw every click, then nobody could fake it, right? In reality, this just changed cheating from faking clicks to spamming my server with HTTP requests. And when your game generates a request for every click, you’ve just invented a way to DDoS your own server. Oops.

Approach 2: Rate limit clicks client-side

I changed the client to only accept a click when 60ms had passed since the last one. I found this was a reasonable upper bound for how fast a human can click, even with a mouse in each hand alternating clicks. (Yes, I had some hardcore players.) But this just slows down a cheater, assuming they don’t bypass the client and go straight to the server.
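
A minimal sketch of that client-side throttle might look like the following; the element id and function names are illustrative stand-ins, not ClickQuest’s actual code:

var MIN_CLICK_INTERVAL = 60; // ms; empirical ceiling on human clicking speed
var lastClick = 0;

// "click-button" and registerClick are hypothetical stand-ins
var clickButton = document.getElementById("click-button");

clickButton.addEventListener("click", function() {
  var now = Date.now();
  if (now - lastClick < MIN_CLICK_INTERVAL) {
    return; // too fast -- ignore the click entirely
  }
  lastClick = now;
  registerClick(); // queue/send the click to the server
});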

Approach 3: Obfuscate the request payload

Aha, if players can bypass the client to cheat, I need to force them to use the client. Simple! Make the packet sent from the client to the server super special secret so that they can’t duplicate it! And of course, obfuscate my javascript so they can’t find out how I do it! A great idea in theory, but it’s pretty trivial to unpack javascript. In reality, this did nothing.

Approach 4: Track as many metrics as I can and figure out where they should be server-side

Finally, a proper solution. I kept track of how many clicks per second a user made whenever they made a request to the server. I’d then compare that to how many clicks per second they’d made over the past 100 requests. If their clicking was consistently close to the 60ms-per-click mark, I’d give them a strike. Likewise if they clicked faster than the 60ms mark, or if they clicked at exactly the same rate constantly. This actually led to catching bots, but unfortunately it had too many false positives and had to be turned off.
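
A rough sketch of that heuristic, with made-up field names and thresholds rather than the original server code, might look like this:

var MIN_MS_PER_CLICK = 60;

// Called on each request with the click rate reported since the last one.
function checkClickRate(player, clicksPerSecond) {
  var history = player.recentRates; // assumed: last ~100 reported rates
  history.push(clicksPerSecond);
  if (history.length > 100) history.shift();

  var msPerClick = 1000 / clicksPerSecond;
  var mean = history.reduce(function(a, b) { return a + b; }, 0) / history.length;

  var fasterThanHuman = msPerClick < MIN_MS_PER_CLICK;
  var alwaysAtTheCap = history.length === 100 && mean > 0 &&
    1000 / mean < MIN_MS_PER_CLICK + 2;
  var unnaturallyConstant = history.length === 100 &&
    history.every(function(rate) { return Math.abs(rate - mean) < 0.25; });

  if (fasterThanHuman || alwaysAtTheCap || unnaturallyConstant) {
    player.strikes += 1; // assumed field
  }
}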

Approach 5: Give up and let it happen

By this point I’d spent 6 months on it and had no idea what else I could do. It is extremely difficult to regulate pure clicks. If I were to attempt it today, I’d use the efficiency of websockets to combine approaches 1 and 4: log the location and time of every click for every user, and use that to try and weed out (poorly designed) bots. But even that would likely fail. The developer simply lacks enough information to properly prevent cheating.

But Cookie Clicker is different than ClickQuest!

Orteil has added back a layer of abstraction in the form of buyable cookie producers. Now, past a certain point, it’s the buildings producing 99% of the cookies – clicking is irrelevant. Using my numbers, a player can legitimately acquire 20 cookies per second from clicking. If you open a websocket connection for each player, you can track whether they have the page open or not. Then, send a packet every 30 seconds, or on each building purchase (whichever happens first). Each packet would carry the player’s cookie count and the count of each type of building. You can verify the buildings against stored data, and the cookies with simple arithmetic. Boom, cheating solved!
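
The “simple arithmetic” would be a back-of-the-envelope bound along the lines of the sketch below; the field names and the per-building production table are placeholders, not Cookie Clicker’s actual data:

var MAX_CLICK_CPS = 20;           // legitimate ceiling for hand clicking
var REPORT_INTERVAL_SECONDS = 30;

// buildingRates: cookies per second for one building of each type (placeholder data)
function maxPossibleCookies(previous, buildingRates) {
  var production = 0;
  for (var type in previous.buildings) {
    production += previous.buildings[type] * buildingRates[type];
  }
  // Whatever they had, plus building output, plus the most a human could click
  return previous.cookies +
    REPORT_INTERVAL_SECONDS * (production + MAX_CLICK_CPS);
}

// Reject any report claiming more cookies than were possible since the last one
function verifyReport(previous, report, buildingRates) {
  return report.cookies <= maxPossibleCookies(previous, buildingRates);
}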

But not quite. You’ve just slowed botting down to human levels. Like an MMORPG, it’s possible to bot as long as you play by the rules. From here, you’d have to use browser fingerprinting or other esoteric metrics to weed out bots. Just remember: the stricter your anti-cheat protection, the more likely you are to anger your legitimate players!

A few parting thoughts

I love javascript clicking games. They seem extremely simple, but they can hide great ingenuity. (Also, they ease my pain of not knowing Flash.) But I feel you need to know what you’re getting into when making one. Unlike Flash or a desktop app, it’s easy for the most basic of programmers to read your entire source code. A little mucking around in dev tools and they can pry your game apart. I find this a great thing, as it allows others to learn from what they find, but it can lead to the problems I mentioned above. Again, just know the limitations of your platform.

They Say the First Step Is the Hardest…

I’ve been a web developer for many years now, and during that time I’ve never really had a site of my own. It always struck me as odd that for someone whose daily routine consisted of making websites, I couldn’t even make one for myself. The sites I made were for a purpose – they provided a service or a piece of information – and I couldn’t figure out what the purpose of a personal site would be. Last year I made a single page site as a portfolio, and I was immediately disgusted with it. I’m a developer, not a designer. The representation of myself as an artist instead of a problem solver just didn’t feel right. So I deleted it and returned to having nothing to say “This is me”.

Over the past year I’ve more firmly integrated myself into the open source world. I started heavily using github and began to collaborate with others. As a part of this process, I found many interesting programmers who were much smarter than me. They all had one thing in common: they had a blog. A few days ago, I found a blog post saying that “comments are a facade fooling people into thinking they have a voice. They don’t. You only have a voice if you have a blog.” While perhaps a bit extreme of a claim, it made me realize that comments are much like posting anonymously: it’s very unlikely that people will use them to build an image of you. Your comments are spread over many websites, sometimes under different usernames. It’s hard for people to build them into a complete picture of you, unless they are the NSA. A blog, on the other hand, is a monolithic compendium of who you are. If somebody reads your blog, they can build an image of the kind of person you are. Perhaps a flawed and incomplete image, but an image nonetheless.

And there I had my answer. A portfolio is silly; I can just link you to what I’ve made. After all, they are websites. A blog is the way for me to actually share the struggles I have and what I learn from them, and hopefully teach you something as well. And just maybe, it’ll lead to me being a better programmer. I look forward to seeing what comes of it.