Gareth Stokes

Daycoding for digital five and fetch tv

Nightcoding objective c, ruby and anything else thats shiny. 

Posts

September 30, 03:43 PM
You know the drill. Installments one and two of my portrait project were big hits, so here's part three. There seem to be a lot of folks out there willing to offer up their photos for me to destroy, and who am I to say no? Enjoy.


























- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

September 21, 01:10 AM

This blog is a revisiting of a previous post I made at my personal blog, and is also posted here.

Every so often, the hot topic of crunch/overtime/extra hours comes (read: anytime a jackass like me writes a post about it or the mainstream media decides to interpret what a jackass like me writes as a scathing commentary on the state of the industry). Sometimes, though, high profile people say something about overtime, and it incites a reaction, no matter what the intent.

not this kind of crunch, guys

It’s no surprise to anyone, in any industry, that overtime can creep into all facets of our lives. In the case of me being both a powerlifter and a video game developer, overtime has come in the form of overtraining and crunching, respectively.

For powerlifters, overtraining generally involves training your body so hard that it effectively cannot recover- your basal body temperature drops, your central nervous system begins to shut down. For game developers, crunching is typically a point in a project that requires developers to work 60, 80, or even 100 hours a week to hit a deadline, pushing our mental capacity to a limit. In both cases, you become lethargic and your sleep suffers, furthering the lack of recovery. The belief for both overtraining and crunch is that that the extra time put in will result in higher quantity and quality of results.

Our ability to cope with such overtime hinges heavily on the amount of time we have to recover from engaging in it.This recovery, whether or active (by, say, deep tissue massage or engaging in an unrelated interest) or passive (by relaxing on the couch or playing video games), can only work if those needing it are given the proper amount of time and resources to ALLOW it to work.

All The King’s Horses

Recently, I spent a large amount of time training for a national powerlifting competition. I trained for 5 months, competed in a local meet as a warm up, and then trained for another 2 months before the national meet. During the initial 5 months I worked very hard in the gym and took regular breaks from training so my body could recover, and I saw great gains in strength. At the local meet I set a personal best in the deadlift, and walked away happy.

He probably overtrained something

After the local  meet, I took only a few days off, and then decided that I had recovered from the first meet and the stress it had put on my body.  Since I had limited time before the national competition, I spent the next 2 months working even harder than I had before. Not only did I not take enough time off after the local meet, I eliminated any down time from my training, effectively giving my body no time to recover. After several weeks of this new training plan, I actually felt weaker. I started skipping workouts and making excuses to myself to not push it too hard. It got to the point where I didn’t even want to compete in the national meet anymore. I was burnt out, stressed out, and flat out tired. Even still, I set a goal to break my personal record for total weight. In fact, I planned on pushing myself to move more weight than I had ever done, even during training.

So what happened? I failed. I didn’t allow my body to recover from the incredible stress I was putting on it, and as a result when it mattered most I couldn’t compete at my full potential. I didn’t reach my goals, and the national meet was one of my worst in recent memory.

All The King’s  Men

Game devs are probably wondering what this meathead, lunkhead, Planet Fitness-reject’s story has to do with them. Well, to paraphrase Matthew McConaughey in “A Time To Kill,” just imagine if what I described above was the progression from a press demo to a final build (or, maybe the progression from shipping a game to working on the next one).

Imagine that the whole team gets behind polishing the absolute hell out of the first hour of gameplay. We spend a good 6 months of time getting the first 3 missions right. The scripting, the code hooks, the character performances, everything. We show it to the press during the months leading up to launch, and all of our attention is focused on that one hour of gameplay shining like a star.

We probably looked like MC Hammer on crack at times, too.

After all of that, the realization sets in that we have 3 months to finish the game. We have to get ALL of that hype into the rest of the missions, the gameplay, the general feel of the game. Maybe this mission here doesn’t have enough of what dude from Kotaku loved, and maybe those animations there don’t live up to the expectations of our publisher. So we start spending extra time revamping, changing, cramming in content. Features creep in. Hours start piling up. Some of us are thinking “I’m working extra hours because I believe in the project,” so when we hit a wall at 10pm, we push through it.

Suddenly it’s Thursday, 2 weeks before submission, and we’ve already put 50 hours in this week. We start to check in bad data. We break the build. We snap at co-workers. We’re no longer being smart or creative about making the game- instead we’re going through zombie-like motions to just get it done. We just want to ship this thing and move on to the next project.

And that passion we had? Starting to dwindle, if not gone already.

Sound familiar?

Putting It Back Together

The thing is, though, that it doesn’t have to be like this. There can be good overtime. For example, I’ve overtrained my deadlift and seen incredible gains by pushing myself on heavy days and taking a month away from the next heavy session. I’ve worked 50-60 hour weeks, with 5 (and rarely 6) days a week for no longer than 2 or 3 weeks to get a deliverable out the door, and have produced work that was higher quality and more rewarding, with no negative effect on my health or marriage.

What made the good overtime better than the bad?

After the bad overtime, I was done. I thought I had recovered enough, so when I tried to (physically and mentally) get back into both activities, I couldn’t. I was done, and wanted out. I considered quitting both powerlifting and game development after the last bad overtime experience with each.

In the case of the good overtime experiences, I was able to take the proper amount of time off that both my body and mind needed to recover not just AFTER the overtime, but DURING it. The work I put during these smaller pushes was of higher quality, more rewarding in the end, and most importantly, kept me engaged in I was doing and looking forward to getting back to it at full tilt as soon as I could.

It can be argued that if we want to be successful, we have to push ourselves harder than the average in our fields. It doesn’t, however, have to have a negative affect on the things we are passionate about. We’ve all read the reports, seen the opinion pieces, heard about EA Spouses and Kaos’ “thousand yard stare.” I’ve read articles on how overtraining has blown out knees, biceps, backs, and worse. Everyone universally agrees that too much overtime is bad- Bad for your health, bad for relationships, bad for studio morale, bad bad bad.

Smell The Roses

Don't let life pass you by

Overtime exists and it’s not going away. I’m not suggesting that it does. I’m not going to rant about crunch time ruining lives. I’m not going to claim that my life has been horribly affected by working overtime or training too hard.

I am, however, going to say this- we all need to manage it better.

I don’t mean that we need to plan better (we know), or avoid feature/exercise creep (we try), or never put in overtime (we will). I mean that we as individuals need to manage how we represent ourselves while working overtime. We need to be conscious of the fact that people who are interested in what we do (powerlifting, game development, insert-your-interest-here) are going to look at us as an example. They’ll see us doing stupid things in the gym or working 100 hours a week, and see us wearing both of those things like honor badges. They’ll see us tweeting about how we’re “crunching to make the game better for you, the consumer!”, or read our Facebook post about how we just totally killed a training session and can’t walk right now- but hey, “no pain no gain!”

Those people will enter our fields and expect that to be the norm, the right way to do things, and they will never question those methods until they too are burnt out. And that’s a damn shame, because we can prevent it. We can teach these newcomers a different lesson- to not make the mistakes that we did. We need to encourage them to come into our industries and change them for the better.

When all is said and done, people will only remember the 4-million-on-day-one sellers, and not the people who worked hard and sacrificed to get the game to that point. We’ll only remember the monster numbers that a powerlifter put up at Worlds, but we’ll never see the training that was put in to achieve that. So let’s take back that part. Let’s do it smarter. Let’s follow the Law of Diminishing Returns.

 

September 13, 08:00 PM

September 14, 2011

Developers building an application on Riak typically have a love/hate relationship with Riak's simple key/value-based approach to storing data. It's great that anyone can grok the basics (3 simple operations, get/put/delete) quickly. It's convenient that you can store anything imaginable as an object's value: an integer, a blob of JSON data, an image, an MP3. And the distributed, scalable, failure-tolerant properties that a key/value storage model enables can be a lifesaver depending on your use case.

But things get much less rosy when faced with the challenge of representing alternate keys, one-to-many relationships, or many-to-many relationships in Riak. Historically, Riak has shifted these responsibilities to the application developer. The developer is forced to either find a way to fit their data into a key/value model, or to adopt a polyglot storage strategy, maintaining data in one system and relationships in another.

This adds complexity and technical risk, as the developer is burdened with writing additional bookkeeping code and/or learning and maintaining multiple systems.

That's why we're so happy about Secondary Indexes. Secondary Indexes are the first step toward solving these challenges, lifting the burden from the backs of developers, and enabling more complex data modeling in Riak. And the best part is that it ships in our 1.0 release, just a few weeks from now.

How Do Secondary Indexes Work?

From an application developer's perspective, Secondary Indexes allow you to tag a Riak object with some index metadata, and later retrieve the object by querying the index, rather than the object's primary key.

For example, let's say you want to store a user object, accessible by username, twitter handle, or email address. You might pick the username as the primary key, while indexing the twitter handle and email address. Below is a curl command to accomplish this through the HTTP interface of a local Riak node:

curl -X POST \
-H 'x-riak-index-twitter_bin: rustyio' \
-H 'x-riak-index-email_bin: rusty@basho.com' \
-d '...user data...' \
http://localhost:8098/buckets/users/keys/rustyk

Previously, there was no simple way to access an object by anything other than the primary key, the username. The developer would be forced to "roll their own indexes." With Secondary Indexes enabled, however, you can easily retrieve the data by querying the user's twitter handle:

 # Query the twitter handle...
 curl localhost:8098/buckets/users/index/twitter_bin/rustyio

 # Response...
 {"keys":["rustyk"]}

Or the user's email address:

 # Query the email address...
 curl localhost:8098/buckets/users/index/email_bin/rusty@basho.com

 # Response...
 {"keys":["rustyk"]}

You can change an object's indexes by simply writing the object again with the updated index information. For example, to add an index on Github handle:

curl -X POST \
-H 'x-riak-index-twitter_bin: rustyio' \
-H 'x-riak-index-email_bin: rusty@basho.com' \
-H 'x-riak-index-github_bin: rustyio' \
-d '...user data...' \
http://localhost:8098/buckets/users/keys/rustyk

That's all there is to it, but that's enough to represent a variety of different relationships within Riak.

Above is an example of assigning an alternate key to an object. But imagine that instead of a twitter_bin field, our object had an employer_bin field that matched the primary key for an object in our employers bucket. We can now look up users by their employer.

Or imagine a role_bin field that matched the primary key for an object in our security_roles bucket. This allows us to look up all users that are assigned to a specific security role in the system.

Design Decisions

Secondary Indexes maintains Riak's distributed, scalable, and failure tolerant nature by avoiding the need for a pre-defined schema, which would be shared state. Indexes are declared on a per-object basis, and the index type (binary or integer) is determined by the field's suffix.

Indexing is real-time and atomic; the results show up in queries immediately after the write operation completes, and all indexing occurs on the partition where the object lives, so the object and its indexes stay in sync. Indexes can be stored and queried via the HTTP interface or the Protocol Buffers interface. Additionally, index results can feed directly into a Map/Reduce operation. And our Enterprise customers will be happy to know that Secondary Indexing plays well with multi data center replication.

Indexes are declared as metadata, rather than an object's value, in order to preserve Riak's view that the value of your object is as an opaque document. An object can have an unlimited number of index fields of any size (dependent upon system resources, of course.) We have stress tested with 1,000 index fields, though we expect most applications won't need nearly that many. Indexes do contribute to the base size of the object, and they also take up their own disk space, but the overhead for each additional index entry is minimal: the vector clock information (and other metadata) is stored in the object, not in the index entry. Additionally, the LevelDB backend (and, likely, most index-capable backends) support prefix-compression, further shrinking index size.

This initial release does have some important limitations. Only single index queries are supported, and only for exact matches or range queries. The result order is undefined, and pagination is not supported. While this offers less in the way of ad-hoc querying than other datastores, it is a solid 80% solution that allows us to focus future energy where users and customers need it most. (Trust me, we have many plans and prototypes of potential features. Building something is easy, building the right thing is harder.)

Behind The Scenes

What is happening behind the scenes? A lot, actually.

At write time, the system pulls the index fields from the incoming object, parses and validates the fields, updates the object with the newly parsed fields, and then continues with the write operation. The replicas of the object are sent to virtual nodes where the object and its indexes are persisted to disk.

At query time, the system first calculates what we call a "covering" set of partitions. The system looks at how many replicas of our data are stored and determines the minimum number of partitions that it must examine to retrieve a full set of results, accounting for any offline nodes. By default, Riak is configured to store 3 replicas of all objects, so the system can generate a full result set if it reads from one-third of the system's partitions, as long as it chooses the right set of partitions. The query is then broadcast to the selected partitions, which read the index data, generate a list of keys, and send them back to the coordinating node.

Storing index data is very different from storing key/value data: in general, any database that stores indexes on a disk would prefer to be able to store the index in a contiguous block and in the desired order--basically getting as near to the final result set as possible. This minimizes disk movement and other work during a query, and provides faster read operations. The challenge is that index values rarely enter the system in the right order, so the database must do some shuffling at write time. Most databases delay this shuffling, they write to disk in a slightly sub-optimal format, then go back and "fix things up" at a later point in time.

None of Riak's existing key/value-oriented backends were a good fit for index data; they all focused on fast key/value access. During the development of Secondary Indexes we explored other options. Coincidentally, the Basho team had already begun work to adapt LevelDB--a low-level storage library from Google--as a storage engine for Riak KV. LevelDB stores data in a defined order, exactly what Secondary Indexes needed, and it is actually versatile enough to manage both the index data AND the object's value. Plus, it is very RAM friendly. You can learn more about LevelDB from this page on Google Code.

Want To Know More?

If you want to learn more about Secondary Indexes, you can read the slides from my talk at OSCON Data 2011: Querying Riak Just Got Easier. Alternatively, you can watch the video.

You can grab a pre-release version of Riak Version 1.0 on the Basho downloads site to try the examples above. Remember to change the storage backend to riak_kv_eleveldb_backend!

Finally keep an eye out for documentation that will land on the newly re-organized Basho Wiki within the next two weeks.

-- Rusty

August 23, 11:49 PM


August 21, 04:27 PM

So, why learn Z80 assembly code, particularly for a very outdated handheld console?

  • It’s a fun way to learn assembly! Rather than programming for some random piece of hardware, you’re making a Gameboy game!
  • It’s inexpensive to purchase the actual hardware. An original Gameboy is $20-$30 on eBay (though Gameboy Colors will also play regular Gameboy games and are even cheaper/easier to find), and a cartridge you can write to via a USB cable is ~$45. With that you get the input, screen, processor, etc. all in one device!
  • Z80 assembly is pretty straightforward. What I mean is, there aren’t that many instructions to keep track of (it’s a CISC architecture), and it’s pretty easy to learn in my opinion.
  • The Gameboy’s variant of Z80 is designed to be used for games. Tile sets, sprites, etc. are handled by the system itself (i.e. it’s built in to the assembly language), so drawing graphics to the screen is fairly painless.
  • The appropriate tools (assembler, linker, etc.) are available on Windows, Linux, and Mac, so development is basically OS-independent. Caveat: Software for writing to some flash carts [writable cartridges] may not be available on every OS.

I recently posted on my personal blog that I’ve been teaching myself to program Z80 assembly for the Gameboy. I already have some assembly programming experience, so the process isn’t too difficult for me; however, others might not, so I figured I’d start up a series on learning assembly code (Gameboy style)! This post will mostly get you set up to get ready to rock, and provide you with a few links to get you going. Tutorials from me will come in future posts!

Note: If at any point you get stuck, feel free to ask me for help! My email is zachary.hoefler@gmail.com. I also recommend the IRC channel #gbdev on EFnet, though it might take a bit to get a reply.

First, you’ll want to set up a development environment. That link will more or less walk you through the process. Basically, you’ll just need to download the RGBDS tool chain and figure out what text editor and emulator you want to use. I highly recommend the BGB Gameboy emulator over nocash (no$gmb). Both will work fine, but BGB is my personal emulator of choice. Both BGB and nocash have debuggers built into them, so they’re really your primary choices for development. (Linux users: BGB also runs fine under Wine!)

At this point, make sure you can compile code properly. This is also covered in the link for setting up a development environment, thankfully! You’ll need to use the command line, but it’s nothing too horrible. In fact, Windows/Linux scripts are available online which will allow you to basically just type “assemble hello-world” and let it automate the assembling, linking, and fixing process. “Fixing” means, basically, that after compiling and linking, a program will run through the output .GB file and make sure a Gameboy will read it as a legitimate game by checking the headers, file size, etc.

Now, we’ve got a development environment set up! Diving in to coding assembly is… well, a bit rough at first. Take a look at this hello world example, with comments. There are a few things you’ll notice:

  • There’s boilerplate code. Every Gameboy ROM has a specific header it has to include in order for the system to read it, and that header goes right into your code. Thankfully, though, the header can be entered using a macro rather than having to enter it every time. Additionally, you have to define handlers for interrupts, start your code at a certain position in the ROM file (that’s what those SECTION assembler commands mean), etc. Try and understand what it’s doing, but don’t expect to memorize it all; you can just look it up later.
  • Each instruction is separated by line breaks. Semicolons denote comments, not the end of an instruction.
  • The instructions like “SECTION,” “INCLUDE,” “DB,” etc. are actually giving instructions to the assembler/linker; they aren’t actual code. These are sort of like C preprocessor commands.
  • There’s a mix of assembly instructions (nop, jp, ld, etc.) and macros (ROM_HEADER, chr_IBMPC1, etc.). The RGBDS assembly has a pretty powerful macro language which is covered very thoroughly in its documentation. They work… well, a lot like C macros. Basically, the code for a macro is inserted wherever the macro is invoked. Additionally, though, you can also do thinks like define a loop using the macro language which lets you have an unrolled loop inside of your code without copying and pasting a line repeatedly.
  • There are labels. These denote sections of code, and work pretty similarly to labels in C. You’ll also use labels to define locations in memory. For example, the ‘Title’ label in the Hello World is the memory address of the “Hello world!” string defined below it.

Sorry for that bombardment of information! However, that should be enough where you can start to identify what a line is when you see it, even if you don’t know exactly what the instruction/macro/etc. is doing just yet.

At this point, a great way to get the hang of programming in assembly is to start modifying code rather than trying to write whole programs from scratch. I recommend checking out these two assignments from Wichita State’s 2008 course on Gameboy assembly programming: Once Upon a Z80, and Hello Gameboy. The first assignment, “Once Upon a Z80,” basically points you to various sections of the Z80 User’s Manual that are worth reading. The second assignment is basically taking the Hello World program I linked above and changing it in some way.

Hopefully that’s enough to get you started! However, if you’re struggling to dive in, don’t worry! My next post is going to be a tutorial starting right at the basics.

Further Reading/Other Resources
http://cratel.wichita.edu/cratel/ECE238Spr08 – This page was from Wichita State’s 2008 course, Assembly Language Programming for Engineers. To make the course more entertaining (and so students could walk away with something cool!), they decided to teach the Gameboy’s Z80 assembly. You can find all of their assignments, downloads, etc. right on the web page. Not only that, all of the students have their work posted on a blog, and there’s a link to their Google Group they used for discussion. Definitely a good resource, particularly if you want a guideline on where to start, what to look up, etc.

Gameboy Dev’rs – Though some of the links are broken by now, GameBoy Dev’rs has a ton of resources on it.

Gameboy Z80 Cribsheet – A really fantastic cheat sheet for Gameboy assembly programming. It’s meant as a reference, so you won’t necessarily be able to learn a lot from it, but it keeps you from having to look stuff up. I highly recommend printing out a copy.

Gameboy Development Wiki – Documentation, tutorials, code examples, hardware information, information on tools like assemblers/compilers/emulators… you can get a lot from this page.

And, if you’re an IRC fan, check out #gbdev on EFnet.

July 23, 04:05 AM

I wanted to share a pair of tricks I picked up a couple of years ago, and while they’re both quite simple, they deserve to be remembered. They’re both a form of low-pass filtering, and while the common usage of low-pass filtering is probably a lot cooler, like controller input smoothing or texture compression, I’ll talk about monitoring and statistics. Oddly enough, there will be a reference to backup-tape schedules in here.

Looking at statistics over time is easy, just store everything on your hard-drive with gigabytes of free space, or upload it to the cloud, or… But what if that’s not an option? Let’s say you think you’re experiencing a slow but steady decrease in frame rate and you want to examine the frame rate history over the last couple of hours. You can’t afford to save that to disk, as that on its own would probably either kill the frame rate, which you’re trying to monitor in the first place, or fill up your memory trying to buffer it all. If you’re on a mobile device, you’re probably in an even worse situation. Not to mention that you’d probably have to start over an run the test again for a couple of hours with the monitoring turned on.

The moving average

One of the most well known filtering techniques is the moving average. The most naive approach is to simply store a number of values and keep averaging. The drawback of this approach is obvious, if you want a very large window, you’ll have to store a lot of values, and the averaging is going to take some time as well. The common solution to this is to use a exponential window, which can be achieved by weighing the previous result with the latest value

The idea here is that the previous results contains information on all previous results, with an ever decreasing factor. This becomes obvious if we expand one step


Expanding further we’ll see that will have a weight of . Here is of course between 0 and 1, otherwise it’d be a strange weight.

Ok, so that works. By having a small , we can increase the influence of previous values and widen the window we’re averaging over. Yes, theoretically old values will never cease to influence the result, but in practice, the weight will eventually cause it to be insignificant (and at some point it’ll fall victim to the limited precision of our computers).

Load average

Let’s say we’re averaging frame times using the moving average above, we’ll calculate a new value every frame. This will give you a value, and it’ll most likely be good enough for whatever you’re testing, but it will probably not be strictly what you think it is.

We’re adding frame times every frame, and displaying the inverse to get a frame rate. The important thing to note here, is that the averaging window is defined in terms of number of frames, rather than in seconds. So, if we have a window of 60 frames, at 60 fps it’ll be a one second window, but at 6 fps, it’ll actually be a 10 second window, tricking our mind.

What our mind prefers to work with is a fixed time window. There are two ways to go about this, you either calculate which coefficient we want to use and add every frame to the result as before. This usually ends up evaluating some exponential function, which isn’t much fun. :) Or, you start counting frames and make sure you reset the counter at fixed intervals and adding that value to the result. Incidentally, you’re now counting frames per second, rather than seconds per frame, funny how that works out sometimes. This is actually how the Linux (and probably other *nix variants as well) calculates load average, with the notable addition that it’s doing it using fixed point arithmetic, I seriously suggest checking it out, as it is simple yet quite brilliant.

Round robin databases

This technique dawned on me while I was working as a system administrator several years ago. It was actually quite ingenious, we had four ‘week’ tapes, one for each weekend of the month, 11 month-tapes for each first weekend of each month from February to December. In January, we’d use a Year-tape (rather than a month tape). The thing is, at any given time, we’d have one backup a week for the last four weeks, one backup a month for the last year, and one backup a year after that. You don’t need to “filter” the backups the same way you do numbers, as any given backup contains all changes made to the file system up to that point, but the same concept holds.

How it works with numbers

If you haven’t seen it before, check out RRDTool or the corresponding Wikipedia page. A round robin database is basically a set of circular buffers, one for each level or ‘decade’, and when you need to reclaim previously written data, you take a whole chunk of it, aggregate it, and insert it into the level/decade above. For each level, the stored value corresponds to a larger time span, which means that in the near past, you’ll have the highest precision, and as you look further back, the data will become more and more blurred. This means you cannot go back and examine your data up close, but the resolution will hopefully be enough to see how your frame rate steadily dropped over the last hour.

So why do it this way? Well, let’s assume we count frames per 100 ms, we store 32 values on each level (which don’t need to be the same size of course and you may tweak it to your hearts content, but I’ll use the same size here), and you combine four values whenever you aggregate. In the first level, we’ll have 3.2 seconds worth of 100ms counts, in the second level we’ll have 12.8 seconds worth of 400ms counts, in the third we’ll have 51.2 seconds worth of 1.6s counts. At the sixth level we have almost an entire hours worth of history, by storing a mere 192 values, compared to the 36 000 values required to store it in raw, and you still have the last couple of seconds at full resolution. Granted, the resolution at the sixth level is 102.4 seconds, but if you’re graphing the entire hour on screen, that might not be so bad (especially on a mobile device). :) At 102.4 seconds, there might also be plenty of time to queue it to be written to storage as well, without much penalty.

Aggregation

You need to pay attention to what you’re measuring. If you’re tracking a rate (like frames or packets per second), you’ll need to average the values. If you’re only tracking a count, you’ll need to sum them up. If you’re tracking the worst frame rate, you’ll need simply grab the largest value, if you’re tracking the best frame rate, you take the smallest. You see where I’m getting at, there’s more to it than simply storing the history, using this technique you can store the average, along with best and worst values, and still only use a mere fraction of what was required if you were to store the raw data. It might even be worth having it activated at all times, so you don’t have to rerun that hour long test to get the results.

Wrap-up

So what am I using this for? This and that actually, usually outside game development, but the reason I brought it up is that I find it rewarding to see player progression over time, and I wanted to have it built into this little gadget I’m working on, where I don’t have the luxury of being able to store massive amounts on disk (mobile device) or online (low budget).

Note: Cross posted on my own blog, excu.se

July 14, 02:00 PM

In the age of Google and Wikipedia, an almost unlimited amount of information is available at our fingertips, and with the rise of smartphones, many of us have nonstop access. The potential to find almost any piece of information in seconds is beneficial, but is this ability actually negatively impacting our memory? The authors of a paper that is being released by Science Express describe four experiments testing this. Based on their results, people are recalling information less, and instead can remember where to find the information they have forgotten.

Read the comments on this post

July 09, 10:00 AM

Have you ever played a game like Starcraft or Supreme Commander and gotten an error message that says “Desync Detected” followed by the game closing? Do you wonder what that means? It stems from certain engine architectures commonly used by RTS games1.

My experience in this area comes from working with the Supreme Commander engine at Gas Powered Games. Starcraft and Warcraft 3 have had desync bugs during beta periods so it’s safe to say they work in a similar manner. For simplicity’s sake I’m going to discuss the SupCom engine specifically from this point forward. Finding similarities between the SupCom engine and other games shall be left as an exercise for the reader. :)

Requirements

First things first, what are the requirements for our game? To help give an idea, here’s the announcement trailer for Supreme Commander 1 (2006).

It must support 8 player multiplayer, on the internet, with hundreds of units per army. That’s thousands of units in a single game. Holy crap. The typical FPS client-server architecture will clearly not work. With so many units it would require multiple orders of magnitude more bandwidth than is acceptable.

How can we accomplish this feat?

Synchronous Engine Architecture

With a fully synchronous lockstep architecture! In a synchronous engine every client executes the exact same code at the exact same frame rate. Let that sink in for a moment. In an 8 player game of SupCom every player has an identical copy of the game state and follows an identical code path. Instead of transferring over per unit state information (position, health, etc) over the network only player input needs to be sent across the networks2. If all players have an identical game state and process the same input then their output state should also be identical.

It’s the same principle as instant replays in many games, including shooters. Have you ever wondered why file sizes of instant replays are so tiny? It’s because the replay file only needs to store player inputs. Simply re-run the game feeding the inputs from the replay file and you’ll get the exact same result. This is also why replays stop working3 when the game is patched and why you frequently can’t rewind them4. It’s also why some RTS games don’t allow join in progress. For a player to join mid-game the entire game state would have to be sync’d. If the game has 3000 units that’s just too much.

Layers

Take a look at the video if you haven’t already. What frame rate do you think the game is running at? The correct answer is 10 frames per second. Wait, what? It looks far smoother than 10 fps you say! It is and it isn’t. The game is actually running at two separate frame rates.

The SupCom engine uses two layers – simulation and user. The simulation layer runs at a fixed 10 fps all the time. This is what you could consider the “real game”. All units, all AI, and all physics are updating within a SimTick function running at a mere 10 hz. Each SimTick needs to run within 100ms or the game will play in slow motion. In a multiplayer game if one player is unable to fully process the SimTick in 100ms then all other players can become held up and have to wait wait.

The user layer runs at full frame rate. This layer can be thought of as strictly visual. User interface, rendering, animation, and even unit position can run at a silky smooth 60fps. Each UserTick updates at a variable time delta which is used to interpolate the game state (such as unit positions). This is why the game can look and feel smooth when it’s secretly slow in the background.

Determinism

Hold on a moment the clever readers cried! If each player is independently updating the game state does that mean the game simulation must be fully deterministic? It sure does. Isn’t that hard? Yep. It’s even more difficult in the modern world of multi-threading.

A lot of pain in creating a deterministic game stems from floating point numbers. Rather than cover this topic in-depth I refer readers to Glenn Fiedler’s fantastic post on the subject - Floating Point Determinism.

In the comments Elijah specifically discusses Supreme Commander. Setting the cpu to strictly follow the IEEE754 standard gets the job done. It comes with a performance cost and the game can never perform an operation with an undefined result, but you shouldn’t be doing that anyways now should you?

Inherent Latency

There are some distinct downsides to a synchronous multiplayer game. Aside from the complexity of creating a massive deterministic simulation there is also some required latency on input. I went over how each user in a multiplayer game is updating an identical game state and they only need to process input. This means for any new piece of input it can’t be processed until all clients agree on which tick to process it!

For example three players – player A, B, and C – are all running SimTick 1. During this time Player A issues an attack command. The UI instantly flashes in response as UserTick updates at 60hz. In a single player game this attack command would be processed in SimTick 2 (0-100ms latency). However all three players must execute the command during the same SimTick to get the same results. Instead of attempting to process the command on SimTick 2 Player A sends a network packet to Player B and Player C with data to execute on SimTick4 (200-300ms latency). This gives enough time for all players to receive the command. The game may be forced to stall if input information is not received and/or acknowledged in some form. I don’t know what that mechanism was exactly for SupCom, but I’ll update this post if I can find out. The exact number of SimTicks into the future to execute a command can be dynamically determined based on the peer-to-peer topology5.

The latency from player click to unit response is always going to be at least 0-100ms (the next SimTick). This can be masked in a few ways. The UI response, usually something flashing, is immediate. There is frequently an audio response as well. “My life for Aiur!” “Zug Zug”

In a single-player game this is fine, but in multiplayer it can become noticeable as the delay is likely several hundred milliseconds. I always wanted to experiment with immediate UserTick animation responses. For example if you issue a move command the unit could start slowly moving in the user layer immediately and then blend into it’s “true” sim layer location when the command actually executes. This would be extra helpful to more twitchy games such as Demigod or DOTA. There are some pretty ugly edge cases to handle though so I’ve never had the chance. I’d love to hear what other folks have done in the comments. :)

Desyncs – The Bugs from Hell

One of the most vile bugs in the universe is the desync bug. They’re mean sons of bitches. The grand assumption to the entire engine architecture is all players being fully synchronous. What happens if they aren’t? What if the simulations diverge? Chaos. Anger. Suffering.6

In SupCom the entire game state is hashed once per second. If any clients disagree on the hash that’s it. Game over. The end. An error box pops up that says “Desync Detected” and you have to quit. Something in their SimTicks varied and now the games are different. They have diverged and they will only get further apart from this point on. There is no recovery mechanism.

Desyncs are usually programmer error. A desync may repro 5% in games lasting more than 60 minutes. Fixing a desync generally involves a binary search of printf-ing the current memory hash as the state is walked. On low repro-rate desyncs this usually leads to a massive spamming of the hash while a half dozen machines loop the game as fast as they can waiting for it to break. Adding insult to injury, one of the most common cases is an uninitialized variable.

A Demigod Tale

A lot of my work with the SupCom engine was actually on Demigod, which used a modified version of the engine.

Near the end of development there was a long-standing, but highly infrequent desync, that was handed to me. In Demigod there are dozens of tiny fodder that run across the map. On extremely rare occasion the location of a fodder would vary by a few centimeters on different machines. Sounds innocuous but like the wings of a butterfly a hurricane of woe can ensue.

I distinctly recall not being sure I could fix it and my lead saying “I’m trusting you to get this fixed. I know you will.” Geez, no pressure right? Every morning we had a 10am stand up and every day my response was the same – “desync hunting.” After almost a week of spiraling into madness I found the issue. If you watched the trailer you’ll notice some hero abilities that knock fodder into the air. When the giant walking castle smashes his hammer down all the fodder go flying. The bug was a pointer to a steering based pathfinding component that dangled until the fodder crashed into the ground and disappeared.

For a desync to occur it wasn’t just that simple. First the fodder had to be killed by one of only a few special abilities. This deleted the pathing component and dangled a pointer. With our memory allocator the deleted component’s memory block was simply moved to a free list, but otherwise unchanged. Then, before the fodder landed, a new memory allocation needed to occur. That allocation needed to be handed the same memory block used by our just deleted memory component. Then and only then would a desync occur. Appropriately setting the pointer to NULL fixed the issue.

Final Thoughts

This has been a very brief overview of a synchronous engine as used by Supreme Commander. I know many older games have worked in similar manner. The latest generation may be much fancier, particularly in terms of handling input latency. I know that Starcraft 2 can desync so it’s likely similar. Other games to look at would be Heroes of Newarth or League of Legends. They aren’t nearly as massive as SupCom and feel highly responsive but I haven’t investigated in depth to see what clever tricks they pull.

If anyone has a good desync war story please share in the comments. :)

Footnotes

  1. Halo actually uses a synchronous lockstep model for Campaign co-op and Firefight.
  2. In SupCom input is handled as commands to groups of units. Commands to move, attack, defend, use ability, etc.
  3. Old replays can be supported if you can run the old game code and data.
  4. Rewinding was accomplished in Halo replays by storing “save points” which store the game state. You can’t smoothly rewind, but you can jump to any previous save points and play forward. I think.
  5. SupCom uses a fully peer-to-peer networking system.
  6. But no force lightning unfortunately.

 

April 28, 05:17 PM

I decided to go with a bit of a different format today. I’m not sure what gave me the idea to make an infographic but I was making a lot of images for the post anyway and I thought “Hey, why not?”  Hopefully that image mostly speaks for itself but in case it doesn’t here’s a quick rundown of the process:

In order to know which graphical tile we need to render for a particular space we need to somehow take a look at the spaces around it.  Now this could be accomplished by lots of if or case statements but using a little math it’s much easier to handle.  We’re using a 4 bit number to create a hash for each space that identifies what the tiles around it look like. If the space has a value of 14 that means there is no tile above it but one to it’s left, one to it’s right and one below it. This will always be the case for a tile of value 14, and all tiles in this situation will have a value of 14. Once we have this value, you know which graphic tile it needs and if you designed your graphics well it will all fit together for any combination. The graphic set I made here is just designed to illustrate which of the 4 sides have a connection and which don’t.

Moving to 8 Bit

This is great for some cases but it can be graphically kind of limiting.  What happens if you care about the corners? Well that’s what I had to deal with for my digging game.  The graphics for this game are actually inverted from how I drew them on the info graphic because you’re cutting out tunnels from otherwise solid land.  Luckily we can just use 8 bits (4 for NSEW and 4 for the corners) and have every case available! Now for those of you good at binary math you might have realized that would lead to 256 different combinations! That’s way too many graphics to make but we can cut it down to 48 unique tiles because not every situation cares about all the corners.  In fact if you just draw some examples out for yourself you’ll notice that we only care about a corner with both sides adjacent to the corner are empty.  Take a look at this example:

Here you can see how in the first and second pictures you can see how the lower right corner matters for the center piece’s shape. On the other hand between the second and third picture you can see how the other corners don’t affect the shape of the center piece at all.  That simplifies from 256 case to just the 48 cases you can see here in my tilemap: (Plus some other tiles for various other things)

There’s still two problems.
1. I made the tilemap art in the way that was easiest from the art side, not the coding side or the order is all over the place.
2. We need to somehow map the 256 possibilities to the correct 48 graphic values.

Both of these problems can be dealt with at the same time by using a look-up table.  The job of this look-up table is to map from 0-255 -> 0-47 depending conditionally on the corner pieces. You can implement this however you want, probably an array where the indexes 0-255 correspond to the result of the bitmasking and it spits out the graphic tile value 0-47.  So, first lets talk about the bitmask we’re going to use for the 8-bit case.

The way I’ve broken it up is so if you just look at the non-corner bits (4 most significant) you’ll get a base case, then you add the corner case to it and you end up with a unique value. This makes it easier to come up with the look-up table because the non-corner cases are largely what defines the tile’s shape and they will be next to each other.  Next I made a table which defined how all 256 values translate:

You can see how some bits have an X instead of 0 or 1. These are the corners that don’t matter for that particular base case.  The “Lower Bits Sum” column shows all values that row accounts for when you do all combinations of 0 and 1 for any x’s.  Hopefully this table makes sense, the first 4 numbers (including x’s) give the 0-256 value that results from bitmasking and the final value column is the index on the tileset graphic I posted above.  You can see they are all out of order because like I said I made them before really thinking about this.

Hopefully this helps people if you need to use something like this in the future. It’s not too complicated once you lay it all out and it’s much easier and cleaner than having all kinds of branching and crazy logic in your code.

Beta testing

The mole digging game I’m making is almost ready to have people start beta testing. It’s very much a prototype at the moment, the core functions are mostly in there but there’s no real aspects to make it a game at this point, I suppose it’s a mole digging simulation.  I want to start getting feedback early and try to be really iterative in the game design process so if you want to help follow this link to sign up on testflight:

(Edit: I took the link out because I’m pretty sure bots were spamming it. If you’re interested e-mail me)

June 06, 12:00 AM
June 03, 04:06 AM

At the risk of getting punched in the face by my friend Miguel, I’m not afraid to admit I’m a fan of responsible use of dependency injection. However, for many folks, attempting to use DI runs into a roadblock when it comes to ASP.NET HttpModule.

In the past, I typically used “Poor man’s DI” for this. I wasn’t raised in an affluent family, so I guess I don’t have as much of a problem with this approach that others do.

However, when the opportunity for something better comes along, I’ll take it Daddy Warbucks. I was refactoring some code in Subtext when it occurred to me that the new ability to register HttpModules dynamically using the PreApplicationStartMethodAttribute could come in very handy.

Unfortunately, the API only allows for registering a module by type, which means the module requires a default constructor. However, as with many problems in computer science, the solution is another layer of redirection.

In this case, I wrote a container HttpModule that itself calls into the  the DependencyResolver feature of ASP.NET MVC 3 in order to find and initialize the http modules registered via your IoC/DI container. The approach I took happens to be very much similar to one that Mauricio Scheffer blogged about a while ago.

using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Mvc;
using HttpModuleMagic;
using Microsoft.Web.Infrastructure.DynamicModuleHelper;

[assembly: PreApplicationStartMethod(typeof(ContainerHttpModule), "Start")]
namespace HttpModuleMagic
{
  public class ContainerHttpModule : IHttpModule
  {
    public static void Start()
    {
      DynamicModuleUtility.RegisterModule(typeof(ContainerHttpModule));
    }

    Lazy<IEnumerable<IHttpModule>> _modules 
      = new Lazy<IEnumerable<IHttpModule>>(RetrieveModules);

    private static IEnumerable<IHttpModule> RetrieveModules()
    {
      return DependencyResolver.Current.GetServices<IHttpModule>();
    }

    public void Dispose()
    {
      var modules = _modules.Value;
      foreach (var module in modules)
      {
        var disposableModule = module as IDisposable;
        if (disposableModule != null)
        {
          disposableModule.Dispose();
        }
      }
    }

    public void Init(HttpApplication context)
    {
      var modules = _modules.Value;
      foreach (var module in modules)
      {
        module.Init(context);
      }
    }
  }
}

The code is pretty straightforward, though there’s a lot going on here. At the top of the class we use the PreApplicationStartMethodAttribute which allows the http module to register itself! Just reference the assembly containing this code and you’re all set to go. No mucking around with web.config!

Note that this code does require that you’re application has the following two assemblies in bin:

  1. System.Web.Mvc.dll 3.0
  2. Microsoft.Web.Infrastructure.dll 1.0

The nice part about this is after referencing this assembly, I can simply register the Http Modules using my favorite DI container and I’m good to go. For example, I installed the Ninject.Mvc3 package and added the following Subtext http module bindings:

kernel.Bind<IHttpModule>().To<BlogRequestModule>();
kernel.Bind<IHttpModule>().To<FormToBasicAuthenticationModule>();
kernel.Bind<IHttpModule>().To<AuthenticationModule>();
kernel.Bind<IHttpModule>().To<InstallationCheckModule>();
kernel.Bind<IHttpModule>().To<CompressionModule>();

There is one caveat I should point out. You’ll notice that when the container http module is disposed, Dispose is called on each of the registered http modules.

This could be problematic if you happen to register them in singleton scope. In my case, all of my modules are stateless and the Dispose method is a no-op, which in general is a good idea unless you absolutely need to hold onto state.

If your modules do hold onto state and need to be disposed of, you’ll have to be careful to scope your http modules appropriately. It’s possible for multiple instances of your http module to be created in an ASP.NET application.

DI for a Single Http Module

Just in case your DI container doesn’t support the ability to register multiple instances of a type (in other words, it doesn’t support the DependencyResolver.GetServices call), or it can’t handle the scoping properly and your http module holds onto state that needs to be disposed at the right time, I did write another class for registering an individual module, while still allowing your DI container to hook into creation of that one module.

In this case, you won’t be using DI to register the set of http modules. But you will be using it to create instances of the modules that you register.

Here’s the class.

using System;
using System.Web;
using System.Web.Mvc;

namespace HttpModuleMagic
{
  public class ContainerHttpModule<TModule> 
    : IHttpModule where TModule : IHttpModule
  {
    Lazy<IHttpModule> _module = new Lazy<IHttpModule>(RetrieveModule);

    private static IHttpModule RetrieveModule()
    {
      return DependencyResolver.Current.GetService<IHttpModule>();
    }

    public void Dispose()
    {
      _module.Value.Dispose();
    }

    public void Init(HttpApplication context)
    {
      _module.Value.Init(context);
    }
  }
}

This module is much like the other container one, but it only wraps a single http module. You would register it like so:

DynamicModuleUtility.RegisterModule(typeof(ContainerHttpModule<MyHttpModule>));

In this case, you’d need to set up your own PreApplicationStartMethod attribute or use the WebActivator.

And of course, I created a little NuGet package for this.

Install-Package HttpModuleMagic

Note that this requires that you install it into an application with the ASP.NET MVC 3 assemblies.

February 24, 05:49 AM

Our C# compiler-as-a-service library can now process any C# construct, it is no longer limited to expressions and statements.

This means, that you can now enter entire class definitions in the command line:

csharp> class Demo {
      >     public int Add (int a, int b)
      >     {
      >          return a + b;
      >     }
      > }
csharp> new Demo ().Add (1, 3);
4
csharp>

This work was done by the amazing Marek and is now available on Mono's master branch in github.

This functionality can also be used for scripts, in particular in Unix, you can now create C# "source executable" files, like this:

bash$ cat demo.cs
#!/usr/bin/csharp
class Demo {
	public dynamic Add (dynamic a, dynamic b)
	{
		return a + b;
	}
}
Console.WriteLine (new Demo ().Add ("this is", " cute"));
bash$ chmod +x demo.cs
bash$ ./demo.cs
this is cute
bash$

Multiple Compiler Instances

In addition, we turned the static API Evalutor.Eval (string expression), into an instance API. The instance API allows developers that are embedding the C# compiler-as-a-service in their application to have different contexts.

This required the entire compiler to be updated from being a giant set of static classes that could safely use global variables and state into a state that was properly encapsulated.

The API is now richer, we provide a way to configure the compiler settings using a settings class. This can be populated either manually, or by using the build-in command-line parser for compiler options. The following sample shows how this could be used:

using Mono.CSharp;
using System;

class Runner {
	static int Main (string [] args)
	{
		var r = new Report (new ConsoleReportPrinter ());
		var cmd = new CommandLineParser (r);
		
		var settings = cmd.ParseArguments (args);
		if (settings == null || r.Errors > 0)
			Environment.Exit (1);

		var evaluator = new Evaluator (settings, r);

		evaluator.Run ("class Demo { public static int Add (int a, int b) { return a+b; }}");
		evaluator.Run ("print (Demo.Add (1,2));");
		return 0;
	}
}

Testers Wanted

This revamped compiler will be part of Mono 2.12, but we would love to get users to test the new functionality and to help us identify any problems early on, before we even release this code.

We do provide a convenient sln file that you can use the compiler as a service, and it works both in Visual Studio/.NET and Mono.

Silverlight

We have not tested this with Silverlight, but in theory, it should now work fine with it. We would love to see someone build an interactive shell like the one we did with Gtk# but hosted on the browser:

February 23, 10:15 AM

It's no big surprise to find out that Vladimir Putin's Russia doesn't much appreciate protests against authoritarian regimes. It's more surprising to find American TV talk show host Glenn Beck saying the same thing. And gets twice as weird when both men name Google as a potent force behind the protests currently sweeping the Middle East.

Putin's Deputy Prime Minister Igor Sechin told the Wall Street Journal recently, "Look what they have done in Egypt, those highly placed managers of Google, what manipulations of the energy of the people took place there," referring to Google execs like Wael Ghonim of Egypt.

Beck, meanwhile, has been promoting dark theories about Google's cooperation with the US government. Last week, he told fellow Fox News host Bill O'Reilly that "there are four or five executives that are also in bed, literally—not literally—but in bed or in the office with the president and working with the White House… Google, in their own words, Google, two vice presidents of Google actually helped foment revolution in Egypt, and they're proud of it."

On his own show, Beck later recommended that his viewers not use Google. "May I recommend, if you’re doing your own homework, don’t do a Google search. Seems to me that Google is pretty deeply in bed with the government. Maybe this is explaining why Google is being kicked out of all the other countries? Are they just a shill now for the United States government? Who is [Google exec] Jared Cohen? Is he private citizen or government operative? And isn’t this the second Google guy we’ve found [after Wael Ghonim]? This is the second Google executive now being exposed as an instigator of a revolution. Are you comfortable with the government partnering covertly with media organizations, search engines, and social networking so they can bring change that the Washington elites have designed?"

Who knew that Google was such a force for popular democratic revolutions—and that this was a bad thing?

Read the comments on this post

February 23, 01:17 PM

One day, you’ll be telling your grandchildren about getting a programming job, version 1.0. You would send a “resume” to a “recruiter.” It included all kinds of silly information required by the esoteric resume ritual (foreign languages spoken, whether or not you play ultimate Frisbee, Microsoft-veteran status). This so-called “information” was utterly useless at determining whether you could program or not, but if you spelled everything right and used suitable fonts, you could come in for a day of interviews at which you would be asked to perform mundane programming tasks on a whiteboard.

Careers 2.0 is here!

Need to hire a really great programmer? Want a job that doesn't drive you crazy? Visit the Joel on Software Job Board: Great software jobs, great people.

April 27, 08:35 AM

A circular array based lock-free queue with no memory allocation on the heap and no ABA problem

January 26, 12:00 AM
February 05, 11:46 AM
Introducing ConcurrentHashTrie

While spending some time looking at Clojure's concurrency model, I did a little bit of research into their persistent collection implementation that uses Bagwell's Hash Array Mapped Tries.  After a little bit of thought it occurred to me that you could apply the persistent collection model to an implementation of ConcurrentMap.  The simple solution would be to maintain a reference to a persistent map within a single AtomicReference and apply a compareAndSet (CAS) operation each time an update to the map was performed.  In Clojure terms, a ref and swap! operation.  However a typical use of ConcurrentMap is for a simple cache in a web application scenario.  With most web applications running with fairly significant numbers of threads (e.g. 100's) it's conceivable that the level of contention on a single compare and set operation would cause issues.  Non-blocking concurrent operations like compare and set work really well at avoiding expensive user/kernel space transitions, but tend to break down when the number of mutator threads exceeds the number of available cores*.  In fact thread contention on any concurrent access control structure is a killer for performance.  The java.util.ConcurrentHashMap uses lock-striping as a mechanism to reduce contention on writes.  If absolutely consistency across the ConcurrentMap was sacrificed (more information below), then a form of "CAS striping" could be applied to the ConcurrentHashTrie to reduce contention.  This is implemented by replacing the AtomicReference with an AtomicReferenceArray and using a hash function to index into the array.

The Implementation

There are a number of implementations of the Bagwell Trie in JVM languages, however I couldn't find an implementation in plain old Java, so I wrote one myself.  For the persistent collection it follows a fairly standard polymorphic tree design.

The core type of Node has 3 implementations.  The key one is the BitMappedNode, which handles the vast majority of branch nodes and implements the funky hash code partition and population count operations that make up the Bagwell Trie structure.  LeafNode holds the actual key/value pairs and ListNode is there to handle full collisions.  The key mutator methods of ConcurrentMap: put, putIfAbsent, remove and replace are implemented using CAS operations on the rootTable, which is an AtomicReferenceArray.

As mentioned earlier, the ConcurrentHashTrie is not consistent for operations that occur across the entire structure.  While at first look this seems terrible, in practical terms it is not really an issue.  The 2 main operations this impacts are size and iteration.  The size operation in the ConcurrentHashTrie is not O(1), but is O(n) where n is the size of the rootTable (or the number of stripes).  Because it has to iterate across all of the nodes in the rootTable (it doesn't need to traverse down the trees) and add all of their sizes it possible for the size of a Node to change after it has been read but before the result has been calculated.  Anyone that has worked a concurrent structure before has probably found that the size value is basically useless.  It is only ever indicative, as the size could change right after the method call returns, so locking the entire structure while calculating the size doesn't bring any benefit.  Iteration follows the same pattern.  It is possible that a node could have changed after being iterated over or just before being reached.  However, it doesn't really matter as it is not really any different to the map changing just before iteration started or just after it completes (as long as iteration doesn't break part way through).  Note that Cliff Click's NonBlockingHashMap exhibits similar behaviour during iteration and size operations.

Performance, The Good, The Bad and the... well mostly Bad

Cliff Click kindly included a performance testing tool with his high scale library which I've shamelessly ripped off and used to benchmark my implementation.  Apparently he borrowed some code from Doug Lea to implement it.  I changed a sum total of 1 line.  Writing benchmarks, especially for concurrent code, is very tough (probably harder than writing the collection itself), so borrowing from the experts gives me some confidence that the numbers I produce will be useful.

So onto the results:




Do'h!!

Quite a bit slower than the java.util.ConcurrentHashMap.  I didn't even bother comparing to the NonBlockingHashMap from the high scale library, the numbers would be too embarrassing.

The ConcurrentHashTrie2 is an alternative implementation that I experimented with.  I suspected the polymorphic behaviour of the tree nodes was causing a significant performance hit due to a high number of v-table calls.  The alternate implementation avoids the v-table overhead by packing all three behaviours into a single class (in a slight dirty fashion).  I stole a couple of bits from the level variable to store a discriminator value.  The Node class switches on the discriminator to determine the appropriate behaviour.  As the results show, it didn't help much.  I ran a number of other permutations and the results were largely the same.

Conclusion

So a question remains, is the approach fundamentally flawed or is my code just crap?  I suspect the major cost is caused by heavy amount of memory allocation, copying and churn through the CPU cache caused by the path-copy nature of mutation operations.  Also the read path isn't particularly quick.  With a reasonably full map, reads will likely need to travels a couple of levels down the tree.  With the traditional map, its a single hash and index operation.

* I really need a proper citation for this.  However imagine 16 cores and a 100 threads trying to apply a compare and set operation to the same reference.  Unlike a lock which will park the threads that fail to acquire the lock, non-blocking algorithms require the thread that fails to apply its change to discard its result and recompute its result, effectively spinning until is succeeds.  With more CPU bound threads than cores, its possible that the system will end up thrashing.
October 08, 02:21 PM

First off, for those that don?t know what Manos is: Manos is an easy to use, easy to test, high performance web application framework that runs on Mono. You can read more about it here.

Lately it seems like everyone is trying to install Manos on OS X, so I spent some time last night trying to install it on my macbook.

I think I?ve fixed most of the issues so its a relatively smooth process now. There are a few more steps than I?d like, but remember this is just alpha software.

So here we go.

Install Mono 2.8

Grab the Mono 2.8 OSX package from the Mono Downloads Page.

You need to have Mono 2.8 installed on your system. An older Mono install wont cut it. Also, if you?ve install Mono from source on your Mac, things might work, things might not work. This guide assumes you have it installed from packages.

Install libev

libev is Manos?s one native dependency. I installed this guy using macports:

sudo port install libev +universal

The key part of this is that we are installing the universal build of libev. If you leave that part out you could get a 64bit version and Mono won?t be able to load it.

You should now have a libev.dylib in /opt/local/lib/ to make sure Mono knows where to find that library, update your DYLD_FALLBACK_LIBRARY_PATH.

export DYLD_FALLBACK_LIBRARY_PATH=/opt/local/lib

Install libev-sharp

libev-sharp is a managed wrapper around libev. The best way to install it is to grab it from my github:

git clone https://jacksonh@github.com/jacksonh/libev-sharp.git
cd libev-sharp
./configue
make
sudo make install

This will install libev-sharp.dll into your /usr/local/lib/libev-sharp directory and will also install a pkg-config file.

Install Manos

Now that all the dependencies are installed you should be able to build and install Manos.

git clone https://jacksonh@github.com/jacksonh/manos.git
cd manos
./configue
make
sudo make install

This will install Manos.dll, manos.exe and libev-sharp.dll into /usr/local/lib/manos. As well as a .pc file and a manos script for invoking manos.exe

Confirm your installation

You should now be able to run the manos documentation server:

manos -docs

and navigate to http://localhost:8181/ in your browser.

Trouble Shooting

  1. If your build fails because libev-sharp isn?t found, you can manually copy it into your manos/build/ directory.

    erp:manos jackson$ cp /usr/local/lib/libev-sharp.dll* build/.

Note that you want to cp libev-sharp.dll* not just libev-sharp.dll. That way you?ll get the .mdb debugging file copied over also.

  1. If you are getting a type load exception, make sure you have the universal libev installed:

    erp:~ jackson$ cd /opt/local/lib

    erp:lib jackson$ file libev.dylib

    libev.dylib: symbolic link to libev.3.0.0.dylib
    

    erp:lib jackson$ file libev.3.0.0.dylib

    libev.3.0.0.dylib: Mach-O universal binary with 2 architectures
    
    libev.3.0.0.dylib (for architecture x86_64):    Mach-O 64-bit dynamically linked shared library x86_64
    
    libev.3.0.0.dylib (for architecture i386):      Mach-O dynamically linked shared library i386
    

Another trick is to use Mono?s logging to see where Mono is looking for libev.dylib

 MONO_LOG_MASK=dll MONO_LOG_LEVEL=debug manos -docs

Hopefully all this works for you, if you notice any problems please let me know. Its my first pass at getting Manos running on OS X so its very possible I?ve missed something.

And remember there is a Manos de Mono Google Group

And you can follow the github project here: http://github.com/jacksonh/manos

Special thanks to Geoff Norton for answering all my ?i dont get the mac? questions.

December 21, 06:11 AM

  

For many beginners, the task of picking fonts is a mystifying process. There seem to be endless choices — from normal, conventional-looking fonts to novelty candy cane fonts and bunny fonts — with no way of understanding the options, only never-ending lists of categories and recommendations. Selecting the right typeface is a mixture of firm rules and loose intuition, and takes years of experience to develop a feeling for. Here are five guidelines for picking and using fonts that I’ve developed in the course of using and teaching typography.

1. Dress For The Occasion

Many of my beginning students go about picking a font as though they were searching for new music to listen to: they assess the personality of each face and look for something unique and distinctive that expresses their particular aesthetic taste, perspective and personal history. This approach is problematic, because it places too much importance on individuality.


The most appropriate analogy for picking type. (Photo credit: Samuuraijohnny. Used under Creative Commons license.)

For better or for worse, picking a typeface is more like getting dressed in the morning. Just as with clothing, there’s a distinction between typefaces that are expressive and stylish versus those that are useful and appropriate to many situations, and our job is to try to find the right balance for the occasion. While appropriateness isn’t a sexy concept, it’s the acid test that should guide our choice of font.

My “favorite” piece of clothing is probably an outlandish pair of 70s flare bellbottoms that I bought at a thrift store, but the reality is that these don’t make it out of my closet very often outside of Halloween. Every designer has a few favorite fonts like this — expressive personal favorites that we hold onto and wait for the perfect festive occasion to use. More often, I find myself putting on the same old pair of Levis morning after morning. It’s not that I like these better than my cherished flares, exactly… I just seem to wind up wearing them most of the time.

Every designer has a few workhorse typefaces that are like comfortable jeans: they go with everything, they seem to adapt to their surroundings and become more relaxed or more formal as the occasion calls for, and they just seem to come out of the closet day after day. Usually, these are faces that have a number of weights (Light, Regular, Bold, etc) and/or cuts (Italic, Condensed, etc). My particular safety blankets are: Myriad, Gotham, DIN,Akzidenz Grotesk and Interstate among the sans; Mercury, Electra and Perpetua among the serif faces.


A large type family like Helvetica Neue can be used to express a range of voices and emotions. Versatile and comfortable to work with, these faces are like a favorite pair of jeans for designers.

2. Know Your Families: Grouping Fonts

The clothing analogy gives us a good idea of what kind of closet we need to put together. The next challenge is to develop some kind of structure by which we can mentally categorize the different typefaces we run across.

Typefaces can be divided and subdivided into dozens of categories (Scotch Modern, anybody?), but we only really need to keep track of five groups to establish a working understanding of the majority of type being used in the present-day landscape.

The following list is not meant as a comprehensive classification of each and every category of type (there are plenty of great sites on the web that already tackle this, such as Typedia’s type classifications) but rather as a manageable shorthand overview of key groups. Let’s look at two major groups without serifs (serifs being the little feet at the ends of the letterforms), two with serifs, and one outlier (with big, boxey feet).

1. Geometric Sans

I’m actually combining three different groups here (Geometric, Realist and Grotesk), but there is enough in common between these groups that we can think of them as one entity for now. Geometric Sans-Serifs are those faces that are based on strict geometric forms. The individual letter forms of a Geometric Sans often have strokes that are all the same width and frequently evidence a kind of “less is more” minimalism in their design.

At their best, Geometric Sans are clear, objective, modern, universal; at their worst, cold, impersonal, boring. A classic Geometric Sans is like a beautifully designed airport: it’s impressive, modern and useful, but we have to think twice about whether or not we’d like to live there.

Examples of Geometric/Realist/Grotesk Sans: Helvetica, Univers, Futura, Avant Garde, Akzidenz Grotesk, Franklin Gothic, Gotham.

2. Humanist Sans

These are Sans faces that are derived from handwriting — as clean and modern as some of them may look, they still retain something inescapably human at their root. Compare the ‘t’ in the image above to the ‘t’ in ‘Geometric’ and note how much more detail and idiosyncrasy the Humanist ‘t’ has.

This is the essence of the Humanist Sans: whereas Geometric Sans are typically designed to be as simple as possible, the letter forms of a Humanist font generally have more detail, less consistency, and frequently involve thinner and thicker stoke weights — after all they come from our handwriting, which is something individuated. At their best, Humanist Sans manage to have it both ways: modern yet human, clear yet empathetic. At their worst, they seem wishy-washy and fake, the hand servants of corporate insincerity.

Examples of Humanist Sans: Gill Sans, Frutiger, Myriad, Optima, Verdana.

3. Old Style

Also referred to as ‘Venetian’, these are our oldest typefaces, the result of centuries of incremental development of our calligraphic forms. Old Style faces are marked by little contrast between thick and thin (as the technical restrictions of the time didn’t allow for it), and the curved letter forms tend to tilt to the left (just as calligraphy tilts). Old Style faces at their best are classic, traditional, readable and at their worst are… well, classic and traditional.

Examples of Old Style: Jenson, Bembo, Palatino, and — especially — Garamond, which was considered so perfect at the time of its creation that no one really tried much to improve on it for a century and a half.

4. Transitional and Modern

An outgrowth of Enlightenment thinking, Transitional (mid 18th Century) and Modern (late 18th century, not to be confused with mid 20th century modernism) typefaces emerged as type designers experimented with making their letterforms more geometric, sharp and virtuosic than the unassuming faces of the Old Style period. Transitional faces marked a modest advancement in this direction — although Baskerville, a quintessential Transitional typeface, appeared so sharp to onlookers that people believed it could hurt one’s vision to look at it.

In carving Modernist punches, type designers indulged in a kind of virtuosic demonstration of contrasting thick and thin strokes — much of the development was spurred by a competition between two rival designers who cut similar faces, Bodoni and Didot. At their best, transitional and modern faces seem strong, stylish, dynamic. At their worst, they seem neither here nor there — too conspicuous and baroque to be classic, too stodgy to be truly modern.

Examples of transitional typefaces: Times New Roman, Baskerville.
Examples of Modern serifs: Bodoni, Didot.

5. Slab Serifs

Also known as ‘Egyptian’ (don’t ask), the Slab Serif is a wild card that has come strongly back into vogue in recent years. Slab Serifs usually have strokes like those of sans faces (that is, simple forms with relatively little contrast between thick and thin) but with solid, rectangular shoes stuck on the end. Slab Serifs are an outlier in the sense that they convey very specific — and yet often quite contradictory — associations: sometimes the thinker, sometimes the tough guy; sometimes the bully, sometimes the nerd; sometimes the urban sophisticate, sometimes the cowboy.

They can convey a sense of authority, in the case of heavy versions like Rockwell, but they can also be quite friendly, as in the recent favorite Archer. Many slab serifs seem to express an urban character (such as Rockwell, Courier and Lubalin), but when applied in a different context (especially Clarendon) they strongly recall the American Frontier and the kind of rural, vernacular signage that appears in photos from this period. Slab Serifs are hard to generalize about as a group, but their distinctive blocky serifs function something like a pair of horn-rimmed glasses: they add a distinctive wrinkle to anything, but can easily become overly conspicuous in the wrong surroundings.

Examples of Slab Serifs: Clarendon, Rockwell, Courier, Lubalin Graph, Archer.

3. Don’t Be a Wimp: The Principle of Decisive Contrast

So, now that we know our families and some classic examples of each, we need to decide how to mix and match and — most importantly — whether to mix and match at all. Most of the time, one typeface will do, especially if it’s one of our workhorses with many different weights that work together. If we reach a point where we want to add a second face to the mix, it’s always good to observe this simple rule: keep it exactly the same, or change it a lot — avoid wimpy, incremental variations.

This is a general principle of design, and its official name is correspondence and contrast. The best way to view this rule in action is to take all the random coins you collected in your last trip through Europe and dump them out on a table together. If you put two identical coins next to each other, they look good together because they match (correspondence). On the other hand, if we put a dime next to one of those big copper coins we picked up somewhere in Central Europe, this also looks interesting because of the contrast between the two — they look sufficiently different.

What doesn’t work so well is when put our dime next to a coin from another country that’s almost the same size and color but slightly different. This creates an uneasy visual relationship because it poses a question, even if we barely register it in on a conscious level — our mind asks the question of whether these two are the same or not, and that process of asking and wondering distracts us from simply viewing.

When we combine multiple typefaces on a design, we want them to coexist comfortably — we don’t want to distract the viewer with the question, are these the same or not? We can start by avoiding two different faces from within one of the five categories that we listed above all together — two geometric sans, say Franklin and Helvetica. While not exactly alike, these two are also not sufficiently different and therefore put our layout in that dreaded neither-here-nor-there place.

If we are going to throw another font into the pot along with Helvetica, much better if we use something like Bembo, a classic Old Style face. Centuries apart in age and light years apart in terms of inspiration, Helvetica and Bembo have enough contrast to comfortably share a page:

Unfortunately, it’s not as simple as just picking fonts that are very, very different — placing our candy cane font next to, say, Garamond or Caslon does not guarantee us typographic harmony. Often, as in the above example of Helvetica and Bembo, there’s no real explanation for why two faces complement each other — they just do.

But if we want some principle to guide our selection, it should be this: often, two typefaces work well together if they have one thing in common but are otherwise greatly different. This shared common aspect can be visual (similar x-height or stroke weight) or it can be chronological. Typefaces from the same period of time have a greater likelihood of working well together… and if they are by the same designer, all the better.

4. A Little Can Go a Long Way

‘Enough with all these conventional-looking fonts and rules!’ you say. ‘I need something for my rave flyer! And my Thai restaurant menu! And my Christmas cards!’ What you’re pointing out here is that all the faces I’ve discussed so far are ‘body typefaces’, meaning you could conceivably set a whole menu or newspaper with any of them; in the clothing analogy presented in part one, these are our everyday Levis. What of our Halloween flares?

Periodically, there’s a need for a font that oozes with personality, whether that personality is warehouse party, Pad Thai or Santa Claus. And this need brings us into the vast wilderness of Display typefaces, which includes everything from Comic Sans to our candy-cane and bunny fonts. ‘Display’ is just another way of saying ‘do not exceed recommended dosage‘: applied sparingly to headlines, a display font can add a well-needed dash of flavor to a design, but it can quickly wear out its welcome if used too widely.

Time for another clothing analogy:


(Photo credit: Betsssssy. Used under Creative Commons license.)

Betsey’s outfit works because the pink belts acts as an accent and is offset by the down-to-earthiness of blue jeans. But if we get carried away and slather Betsey entirely in pink, she might wind up looking something like this:


(Photo credit: Phillip Leroyer). Used under Creative Commons license.)

Let’s call this the Pink Belt Principle of Type: display faces with lots of personality are best used in small doses. If we apply our cool display type to every bit of text in our design, the aesthetic appeal of the type is quickly spent and — worse yet — our design becomes very hard to read. Let’s say we’re designing a menu for our favorite corner Thai place. Our client might want us to use a ‘typically’ Asian display face, like Sho:

So far, so good. But look what happens when we apply our prized font choice to the entire menu:

Enough already. Let’s try replacing some of the rank-and-file text copy with something more neutral:

That’s better. Now that we’ve reined in the usage of our star typeface, we’ve allowed it to shine again.

5. Rule Number Five Is ‘There Are No Rules’

Really. Look hard enough and you will find a dazzling-looking menu set entirely in a hard-to-read display font. Or of two different Geometric Sans faces living happily together on a page (in fact, just this week I wound up trying this on a project and was surprised to find that it hit the spot). There are only conventions, no ironclad rules about how to use type, just as there are no rules about how we should dress in the morning. It’s worth trying everything just to see what happens — even wearing your Halloween flares to your court date.

In Conclusion

Hopefully, these five principles will have given you some guidelines for how to select, apply and mix type — and, indeed, whether to mix it at all. In the end, picking typefaces requires a combination of understanding and intuition, and — as with any skill — demands practice. With all the different fonts we have access to nowadays, it’s easy to forget that there’s nothing like a classic typeface used well by somebody who knows how to use it.

Some of the best type advice I ever received came early on from my first typography teacher: pick one typeface you like and use it over and over for months to the exclusion of all others. While this kind of exercise can feel constraining at times, it can also serve as a useful reminder that the quantity of available choices in the internet age is no substitute for quality.

Other Resources

You may be interested in the following articles and related resources:

(ik) (vf)


© Dan Mayer for Smashing Magazine, 2010. | Permalink | Post a comment | Add to del.icio.us | Digg this | Stumble on StumbleUpon! | Tweet it! | Submit to Reddit | Forum Smashing Magazine
Post tags: ,

January 01, 01:02 PM

storagedude writes "Traditional databases and storage networks, even those sporting high-speed solid state drives, don't offer enough performance for the real-time analytics craze sweeping corporations, giving rise to in-memory analytics, or data mining performed in memory without the limitations of the traditional data path. The end result could be that storage and databases get pushed to the periphery of data centers and in-memory analytics becomes the new critical IT infrastructure. From the article: 'With big vendors like Microsoft and SAP buying into in-memory analytics to solve Big Data challenges, the big question for IT is what this trend will mean for the traditional data center infrastructure. Will storage, even flash drives, be needed in the future, given the requirement for real-time data analysis and current trends in design for real-time data analytics? Or will storage move from the heart of data centers and become merely a means of backup and recovery for critical real-time apps?'"

Read more of this story at Slashdot.



Repositories

Posts

Cafe insieme, Melbourne

Peppermill cafe in milsons point

Xenon cafe in crows nest… Tasty!!

cafe sofias in erko, this was fantastic but i hear there is a better one up the road…

abcdefghijklmnopqrstuvwxyz abcdefghijklmnopqrstuvwxyz