I just read an article by Sergi Mansilla, Extending JavaScript with inline unit tests, where he implements a pretty neat inline testing syntax using sweet.js, which allows writing code like this:

function square(n) {
    return n * n;
} where {
    square(2) is 4
    square(3) is 9
}

Obviously this isn’t really valid javascript – sweet.js is a javascript preprocessor which allows you to write macros in your code which are then rewritten as valid javascript. I liked the idea of being able to write inline tests, but wanted to be able to achieve it in pure js only, so I wrote inline, a little snippet of javascript (~20 lines) which implements this functionality. The syntax is a bit different, but I think it’s still pretty readable:

var square = function(n) {
    return n * n;
}.where(3).shouldEqual(9);

The way it works is simple. Inline adds a ‘where’ function to the prototype of Function, which returns an Inline object which records the function and the arguments and also has assertion functions (currently only shouldEqual() but it’s pretty obvious how more could be added).
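For the curious, the whole idea can be sketched in roughly those twenty lines. This is a reconstruction from the description above, not the actual inline source, so treat the details (names, error handling) as illustrative:

```javascript
// where() captures the function and its arguments; shouldEqual()
// invokes the call and checks the result against the expectation.
function Inline(fn, args) {
    this.fn = fn;
    this.args = args;
}

Inline.prototype.shouldEqual = function (expected) {
    var actual = this.fn.apply(null, this.args);
    if (actual !== expected) {
        throw new Error("Expected " + expected + " but got " + actual);
    }
    return this.fn; // hand the original function back so tests can chain
};

Function.prototype.where = function () {
    return new Inline(this, Array.prototype.slice.call(arguments));
};

// Usage: because shouldEqual() returns the original function,
// the assignment still receives the real square function.
var square = function (n) {
    return n * n;
}.where(3).shouldEqual(9);
```

Returning the original function from shouldEqual() is what makes chaining multiple where().shouldEqual() pairs possible, and also explains the caveat below about calling where() without an assertion.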


– To use it with functions written like function fn() { ... } you must wrap the function declaration in parentheses to avoid a syntax error: (function fn() { ... }).where(...).shouldEqual(...);
– Make sure not to call where() without following it with an assertion function, as it’ll replace the original function! (ie. for the code var fn = function() { ... }.where(...);, fn will be an Inline object and not the specified function)
– You can chain multiple where().shouldEqual()s on a function to run multiple tests

Inline.js is only a quick experiment, and I wouldn’t really recommend mixing tests and code, but I do think there’s something interesting about inline testing, and could see it being used for teaching or demonstration purposes. Sergi’s code definitely offers an advantage that as it’s written as a macro you can write the tests inline but strip them out for production.

It’s been a long time since I’ve written anything on this blog, which I really regret. I’ve got a lot going on in my life at the moment: I’ve got a nine-month-old daughter who takes up a lot of time, I’m in the process of planning to build a new home, and I’m currently working on a new project (QuickAnalyses) with some guys I met at the Launch 48 event in Exeter (which was a really enjoyable and worthwhile event that I had hoped to blog about but ran out of time!). On top of all of this, I’ve started working for a new company and my workload has really picked up, leaving me very little free time.

However, I don’t want this blog, or my efforts at ‘extra curricular’ projects, to die out. And for this reason, I want to put down in public some ideas that I hope to work on in the coming months. At the very least, I hope these will guilt me into making some extra free time for myself, and perhaps give me something to blog about in the process:

Inspired by this poll on Hacker News, this is the project that I’m really aching to get cracking on. Basically it would be a website that allowed people to pledge money for the completion of issues on open source projects on github that they really want resolving. I think this could be a worthwhile project and allow those who don’t necessarily have the ability to fix an issue themselves to incentivise someone who could to do it.

Update: Looks like I jumped the gun on this one, and a quick google could have saved me some trouble. Someone’s beaten me to it, and to add insult to injury, used the exact same name I came up with! Guess I’m not as creative as I’d thought! It looks promising though – check it out at gitbounty.io

This one would be personally useful to me at work, as we’re currently exploring new ways of managing products within the scope of a brand new company. It’s basically a client-facing project management/issue tracker tool, giving customers a single place to get quotes for new work, file bugs and communicate directly with developers. A bit of googling turned up Duet, which is very close to what I want, but I want this to be completely free and open source, and also to run on .net (rather than php, which I believe Duet is built on). Also, it claims to be ‘beautiful’ but looks kinda 90s to me (especially with those nasty logos). Though not the most exciting of projects, it would be useful for me on a day-to-day basis, and would also be a chance for me to dig in to some new technologies – namely AngularJS to build a dynamic, SPA frontend, and RavenDB (which I’ve never had the opportunity to use) for the database.
I use dotless on almost all of my projects and love it, but I’m really intrigued to use Stylus – another css preprocessor that looks pretty ace. I’ve not yet used it though as there is no really nice way to use it from within a .net project (yes, there are ways but I don’t particularly like them). I also like the look of axis css, which is part of the roots project and builds upon stylus, but haven’t yet had a chance to use it.

Occasionally when using javascript you’ll find that you want to intercept a function call, usually to run some code before invoking functions belonging to a third-party codebase. Essentially, this is achieved by saving a reference to the original function and then replacing it with a new function which, at the end, calls the original. Something like:

(function () {
    var originalFunction = Object.getPrototypeOf(someObject).someFunction;
    Object.getPrototypeOf(someObject).someFunction = function () {
        // intercept code here
        return originalFunction.apply(this, arguments);
    };
}());

Likewise, interception is possible if a property has a getter or a setter:

(function() {
    var propDesc = Object.getOwnPropertyDescriptor(Object.getPrototypeOf(someObject), "someProperty"),
        originalGetter = propDesc.get,
        originalSetter = propDesc.set;

    propDesc.set = function(val) {
        // setter intercept code here
        return originalSetter.call(this, val);
    };

    propDesc.get = function() {
        // getter intercept code here
        return originalGetter.call(this);
    };

    Object.defineProperty(Object.getPrototypeOf(someObject), "someProperty", propDesc);
}());

As you can see this is basically the same; the only real difference is that you need to use Object.getOwnPropertyDescriptor / Object.defineProperty rather than being able to set it directly using regular assignment. This is because var getter = someObject.someProperty will return the result of the getter function, and someObject.someProperty = "something" will invoke the setter function.
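To see both techniques working together, here’s a small self-contained example; the Counter object and its members are invented purely for illustration:

```javascript
// A toy object with one method and one getter-backed property.
function Counter() {
    this._value = 0;
}
Counter.prototype.increment = function () {
    this._value += 1;
};
Object.defineProperty(Counter.prototype, "value", {
    get: function () { return this._value; },
    configurable: true
});

var log = [];
var counter = new Counter();

// Intercept the method: save the original, wrap it, delegate at the end.
(function () {
    var originalIncrement = Object.getPrototypeOf(counter).increment;
    Object.getPrototypeOf(counter).increment = function () {
        log.push("increment called");
        return originalIncrement.apply(this, arguments);
    };
}());

// Intercept the getter via its property descriptor.
(function () {
    var proto = Object.getPrototypeOf(counter);
    var propDesc = Object.getOwnPropertyDescriptor(proto, "value");
    var originalGetter = propDesc.get;
    propDesc.get = function () {
        log.push("value read");
        return originalGetter.call(this);
    };
    Object.defineProperty(proto, "value", propDesc);
}());

counter.increment();
var current = counter.value; // current === 1, log now has two entries
```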

Before I start, I want to go on the record to say that I do not agree with using function overloads in javascript. Javascript doesn’t really support function overloading, and overloaded functions (at least in my experience) seem to lead to unclear code; the better solution is to pass in an object and determine behaviour from that.

That said, the other day I did have to implement function overloading, for a new project I started which aims to simplify basic animations when using paper.js. I wanted my solution to integrate nicely with existing code written for paper, so I decided to use a proxy object which intercepts the tranform function calls on an item and animates it, and for this reason my proxy object needed to implement matching method signatures.

Typically when people use function overloading in javascript, they are dealing with different numbers of arguments, but paper.js complicates things slightly as there can be ambiguities in function calls which can be resolved not by counting the number of arguments but by looking at the “types” of the arguments. Take the scale function, for example, which has the following two signatures:

  • scale(Number:scale [, Point:center])
  • scale(Number:hor, Number:ver[, Point:center])

When calling something.scale(a, b), we cannot know which function to use without further inspecting b. I briefly googled function overloading in javascript and came up with a few solutions that dealt with overloads taking different numbers of arguments, but nothing that dealt with ambiguities such as the one above, and so I came up with this method:

function selectOverload(that, args, overloads) {
	var types = {
		Number: function(val) { return !isNaN(parseFloat(val)) && isFinite(val); },
		Point: function(val) { return val.x !== undefined && val.y !== undefined; }
	};

	for (var o = 0; o < overloads.length; o++) {
		var overload = overloads[o],
			matches = true;
		if (args.length > overload.params.length) continue;
		for (var a = 0; a < args.length; a++) {
			if (!types[overload.params[a]](args[a])) {
				matches = false;
				break;
			}
		}
		if (matches) { return overload.fn.apply(that, args); }
	}
}

And it gets used like this:

AnimationProxy.prototype.scale = function() {
	selectOverload(
		this, arguments,
		[{
			params: ["Number","Point"],
			fn: function(scale, center) {
				// code for this overload here
			}
		}, {
			params: ["Number","Number","Point"],
			fn: function(hor, ver, center) {
				// code for this overload here
			}
		}]
	);
	return this;
};

Although not perfect, I’m pretty pleased with this code. I think the intention is pretty clear, and I like that the arguments are correctly named. It doesn’t actually check the type of the argument; the “type definitions” are really just boolean functions which return true or false depending on whether an argument meets their “type criteria”, and so this can easily be used in a whole variety of scenarios.

Also, I made the decision to pass a string value in for the type and keep the type defs inside the selectOverload function, but that’s just because it suited the situation I was working with; there’s no reason you can’t adapt the code to take the criteria function itself. In my situation I didn’t feel that would be as readable, but if you have a lot of different types then it may work out more efficient.
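As a quick demonstration of the dispatch resolving the ambiguity described earlier, here’s a toy object mirroring the paper.js scale() signatures (the shape object and the return values are invented purely to show which overload was matched):

```javascript
// The overload selector: tries each candidate in order, matching
// arguments against named boolean "type" checks.
function selectOverload(that, args, overloads) {
    var types = {
        Number: function (val) { return !isNaN(parseFloat(val)) && isFinite(val); },
        Point:  function (val) { return val && val.x !== undefined && val.y !== undefined; }
    };
    for (var o = 0; o < overloads.length; o++) {
        var overload = overloads[o],
            matches = true;
        if (args.length > overload.params.length) continue;
        for (var a = 0; a < args.length; a++) {
            if (!types[overload.params[a]](args[a])) {
                matches = false;
                break;
            }
        }
        if (matches) { return overload.fn.apply(that, args); }
    }
}

// scale(scale [, center]) vs scale(hor, ver [, center]) — both can be
// called with two arguments, so only the second argument's "type"
// can tell them apart.
var shape = {
    scale: function () {
        return selectOverload(this, arguments, [{
            params: ["Number", "Point"],
            fn: function (scale, center) { return "uniform"; }
        }, {
            params: ["Number", "Number", "Point"],
            fn: function (hor, ver, center) { return "non-uniform"; }
        }]);
    }
};

var a = shape.scale(2, { x: 0, y: 0 }); // "uniform": second arg looks like a Point
var b = shape.scale(2, 3);              // "non-uniform": second arg is a Number
```

Note that overload order matters: candidates are tried top-to-bottom, so the more specific signature should come first.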

Consider that you have two objects, Parent and Child. A child has a parent, and a parent maintains an IEnumerable of children.

class Parent
{
    public IEnumerable<Child> Children { get; set; }
}

class Child
{
    public Parent Parent { get; set; }
}

Given a list of child ids, you need to populate the parent with a list of Child instances, and also set the parent of each child.

Method 1: foreach

First option is to use foreach to iterate through each id in the list, get the child, set the parent on it and add it to the parent’s list of children. Since IEnumerable doesn’t support an Add() method, we need to modify Parent to take a List<> or Collection<> instead, and then we can use the code:

foreach (var id in childIds)
{
    var child = GetChild(id);
    child.Parent = parent;
    parent.Children.Add(child);
}

Now I guess there’s nothing wrong with this; it does get the job done, but when converting a list of something into a list of something else, it seems really wrong not to use linq. Plus, if you were unable to modify the Parent class to use a data type that supported Add(), then this would be out of the question.

Method 2: linq

parent.Children = childIds.Select(id =>
{
    var child = GetChild(id);
    child.Parent = parent;
    return child;
});

As you can see, although this code has been modified to use linq, the lines match almost one-to-one. We can now, however, leave the Parent’s list of children unchanged as an IEnumerable. Personally, though, I would rather use a single-line lambda expression than a multiline block like this, as I feel they read much better and make it far clearer what the intended action is.

Method 3: object extension

List<T> has a useful method, ForEach(), which allows the execution of an action on each item of a list. It returns void, but the behaviour is fairly basic so it is easy enough to knock up a quick extension method that does the same and returns the item. I want to be able to reuse this code, as the ability to update an object within a linq query is a useful one, so rather than extending Parent you can extend object:

public static object SetProperty(this object entity, Action<object> setter)
{
    setter(entity);
    return entity;
}

This allows for the following code:

parent.Children = childIds
                  .Select(id => GetChild(id).SetProperty(c => (c as Child).Parent = parent))
                  .Select(o => o as Child);

Eeesh. The idea worked in principle, but clearly using an object extension is not the way to go! While I like having the ability to chain the method, it’s not nice to have the nested cast to access the parent property. Furthermore, since the extension method returns object, a further Select was required which simply cast the list items back to Child.

Method 4: generic extension

The final solution is, therefore, fairly obvious, and makes for an elegant and easy-to-read solution.

public static T SetProperty<T>(this T entity, Action<T> setter)
{
    setter(entity);
    return entity;
}

parent.Children = childIds
                  .Select(id => GetChild(id).SetProperty(c => c.Parent = parent));

I’ve just read a post by Alex Maccaw, 5 APIs that will transform the Web in 2013, and think that while the APIs described are all pretty cool, my personal favourite has not been mentioned: web intents. The specification is currently in the ‘working draft’ stage and so chances are it still won’t be finalised next year, which probably means browser coverage will be lacking (particularly in the IE camp), but Firefox and Chrome are already picking it up and so at least that’s something!

Web intents is inspired by Android’s Intents framework, and allows for web-based inter-app communication and service discovery. The example that is typically bandied about when talking about web intents is photo editing. Say, for example, I build a web application that allows users to upload photos, and I want to enable them to manipulate the photo on my site. I could spend a lot of time trying to develop my own editing system, but the likelihood is that it would be buggy, or lacking in features, because this is simply not my area of expertise. Using the web intents framework, however, I could simply integrate with a third-party photo editing application which is hopefully less buggy and more feature-packed than anything I could code myself. Furthermore, since all my application is doing is declaring that it needs a certain type of service, rather than specifying one service specifically, the user is able to select whichever photo editing application suits them best.

This example works well, given the difficulties that can arise for the average developer dealing with advanced image manipulation techniques, but the web intents specification is open enough that anyone can register a service for any intent that they wish. Take a look at the demos on webintents.org for some different sample applications, and to get an idea of how this system will work. I think it’s really exciting, and it will make it easier for us all to develop richer web applications going forward.

If ever you accept user-written HTML code in your web applications, such as may be generated in a rich-text ‘wysiwyg’ text editor, it is vital that before displaying it back anywhere you first sanitize it. Sanitization is the process of removing potentially malicious code, primarily to prevent xss (cross-site scripting) attacks; and is generally achieved by allowing only a subset of tags and attributes in the submitted code and removing or encoding the rest.

I recently needed to do this, and a quick google turned up a project, patapage, which does just that. Although it is a java solution, there is a C# port written by Beyers Cronje; unfortunately it’s some seriously ugly code, being more-or-less a straight rip of the java version, just fixed to be valid c# code.

I realise that some people don’t think that is necessarily a bad thing, but I can’t stand to see eyesores such as lowercase method names and explicit type names where ‘var’ would do, and so had to clean it up. I take no credit for any of the code; all I did was capitalize property/method names, replace some if/elses with ternary operators for terseness (where appropriate), replace type names with ‘var’, and change some arrays to IEnumerables (I hope this should give a bit of a performance gain, but I didn’t bother to check so don’t quote me on that).


I read an article the other day by Nicholas C. Zakas titled ‘The Problem with Native Javascript APIs’, and found it thoroughly depressing.

“Browsers are written by humans just like web pages are written by humans. All humans have one thing in common: They make mistakes. Browsers have bugs just like web pages have bugs just like any other software has bugs. The native APIs you are relying on likely have bugs”

I’m not totally naïve, of course I understand that it is rare that a code base will be bug-free, but I find it a real shame that someone like Nicholas, a respected authority for javascript who strongly advocates best-practise methodologies, would be so dismissive of native code within the browser. These APIs are in there to help us, as developers, and as a result of being coded natively are typically considerably faster than equivalent javascript would be.

So why does he have this attitude? It’s not totally unfounded: he presents an example of a native API which had different bugs in the Firefox and WebKit implementations, which we know from history is just one of many browser bugs.

His solution to avoiding native APIs, then? Write it yourself, of course!

There are two ways that you can rewrite a native API: either by using a façade, such as jQuery, which provides an alternate interface to existing code; or a polyfill, such as Modernizr, which attempts to implement native API functionality which may be missing. Nicholas advocates facades, as polyfills “represent yet another implementation of the same functionality”. I don’t totally understand this, as it seems that facades do the exact same thing, just with a different interface, but that’s neither here nor there, as it seems to me that both have their place within a code base.

The final solution presented recreates the functionality of the native API, but without using it directly. This to me stinks of reinventing the wheel. Furthermore, I think it’s downright arrogant to assume that your code is somehow impervious to bugs. Imagine if we all had this viewpoint, and used no third party code at all. The reason we use third party libraries and frameworks is because they allow us to concentrate on the code that is relevant to us. If you know there is a bug in some code, don’t waste your time by duplicating the functionality and adding to your own codebase, let the developers know: file bug reports, email them, tweet. Get it fixed and help everybody.*

*Interestingly enough, the author has even noted that the bugs mentioned in the case study have both been sorted! Think about how many of us developers are using browsers: it’s far more likely that a bug in Chrome will be noticed, for example, than a bug in your code.

I was flicking through an old notebook that I used for uni notes the other day, and I stumbled across my overly-simplistic guide to Principal Component Analysis: PCA for Morons. It’s a really cool data-mining technique, so I figured it would be worth fleshing it out to be slightly more detailed than a few short bullet points!

What is Principal Component Analysis and Why is it Useful?

Data mining is the process of discovering patterns within large volumes of data. This data typically contains redundant information, associations that would be difficult to spot due to high numbers of variables, and lots of noise; making it virtually impossible to detect these patterns without the use of statistical/artificially intelligent processing systems.

Essentially, PCA aims to transform a data set by creating a new smaller set made up from combinations of the original data, known as Principal Components. This has the result of shrinking the size of the search space, making predictions / analysis easier, and the principal components are ordered to indicate the most interesting patterns identified. PCA does lead to a loss of information, though it is typically minimal, and the dataset can be reconstructed afterwards, so it can be used as a lossy compression method too.

The Basic Gist of PCA Dimension Reduction

Correlations between v1 & v2, and v1 & v3

The awesome chart to the left (yes, that is done with the airbrush tool in MS Paint) attempts to show how this reduction is possible. The left hand side plots the variables v1 and v2, which as you can see have no correlation – if you know v1, it would be very difficult to predict v2. The right hand side, however, shows two variables (v1 and v3) with a strong correlation – as v1 grows, so does v3, making predictions much easier. Because of this strong relationship we can transform the data, combining multiple axes down to one single axis.

This process is performed for all dimensional pairs within the data, and then all related variables are reduced. This process isn’t limited to reducing just two dimensions, but can reduce any number, as long as they are all sufficiently correlated.

The Data

In this post I’ll perform PCA on a data set made up of 15 three-dimensional items. This isn’t real data – I made it up for the sake of the example, and it is only three-dimensional for speed and clarity of concepts – but there is no limit to the dimensionality of data when applying PCA to a real dataset.
Before performing PCA the data needs to be mean adjusted, so that the mean of the whole data set is 0, and this is achieved by simply calculating the mean across each dimension and subtracting it from each item.
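In symbols, with $\bar{v}_j$ denoting the mean of dimension $j$ over the $n$ items, each adjusted value is:

```latex
v'_{ij} = v_{ij} - \bar{v}_j,
\qquad
\bar{v}_j = \frac{1}{n}\sum_{i=1}^{n} v_{ij}
```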

Raw Data                  Mean Adjusted Data
v1      v2      v3        v1′     v2′     v3′
5.60    5.10    8.80      2.44    2.01    3.53
1.00    0.80    2.40     -2.16   -2.29   -2.87
2.30    2.20    4.20     -0.86   -0.89   -1.07
3.10    3.00    6.00     -0.06   -0.09    0.73
2.50    2.70    2.60     -0.66   -0.39   -2.67
1.20    1.50    2.20     -1.96   -1.59   -3.07
4.80    4.50    8.70      1.64    1.41    3.43
4.00    4.40    7.80      0.84    1.31    2.53
3.90    3.80    6.30      0.74    0.71    1.03
1.50    1.20    2.00     -1.66   -1.89   -3.27
3.90    4.10    4.50      0.74    1.01   -0.76
3.40    3.50    6.80      0.24    0.41    1.53
3.70    3.00    5.90      0.54   -0.09    0.63
2.10    2.50    3.80     -1.06   -0.59   -1.47
4.40    4.10    7.00      1.24    1.01    1.73

Covariance Matrix

As I said before, PCA compares all pairs of dimensions in the data in order to find correlations, and this is achieved by measuring the covariance – the variance of one dimension with respect to another. Variance is a measure of the spread of one-dimensional data, and is calculated by summing the squared differences between each data point and the mean, then dividing by the number of data points minus 1 (it is the standard deviation squared):
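Written out as a formula, for a single dimension $v$ with mean $\bar{v}$ over $n$ points:

```latex
\operatorname{var}(v) = \frac{\sum_{i=1}^{n} (v_i - \bar{v})(v_i - \bar{v})}{n - 1}
```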

Covariance is very similar, but is calculated using two variables (v1 and v2) instead of just the one (v):
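That is, the deviations of the two dimensions are multiplied together before averaging:

```latex
\operatorname{cov}(v_1, v_2) = \frac{\sum_{i=1}^{n} (v_{1i} - \bar{v}_1)(v_{2i} - \bar{v}_2)}{n - 1}
```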

To compare all pairs of dimensions within the data, we can construct a covariance matrix, of the form:
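For the three-dimensional dataset above, that matrix looks like this:

```latex
C = \begin{pmatrix}
\operatorname{cov}(v_1, v_1) & \operatorname{cov}(v_1, v_2) & \operatorname{cov}(v_1, v_3) \\
\operatorname{cov}(v_2, v_1) & \operatorname{cov}(v_2, v_2) & \operatorname{cov}(v_2, v_3) \\
\operatorname{cov}(v_3, v_1) & \operatorname{cov}(v_3, v_2) & \operatorname{cov}(v_3, v_3)
\end{pmatrix}
```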

Note: if you’re calculating this yourself, notice that it is symmetrical [ie. cov(x,y) == cov(y,x)] so you can save yourself some computation time and just calculate half of it

Eigenvalues and Eigenvectors

Full disclosure: this bit is magic. Or at least if there is some mathematical reasoning behind it, I don’t know it.

With our covariance matrix, we can get some useful numbers out of it, known as eigenvalues, and some useful vectors, known as (surprise, surprise) eigenvectors. These can only be found for square matrices, and for an n x n matrix there will be n pairs of eigenvalues and eigenvectors, with each eigenvector representing n dimensions (x1, x2 … xn). I shan’t go into too much detail about what they are or how they are calculated, because they are a bit of a mystery to me, but most maths packages should be able to calculate them for you. The only point you may need to bear in mind is that PCA requires unit vectors – that is, the vectors should be normalized so that they are all of length 1. Fortunately, most maths packages will return unit vectors, but it’s worth checking if you are unsure.

For the dataset above, the eigenvalues are: 8.62, 0.35 and 0.05; and the eigenvectors are: [0.45,-0.50,-0.73], [0.42,-0.61,0.67] and [0.79,0.61,0.65].

The eigenvectors characterise the data, and the corresponding eigenvalue indicates just how representative of the data the vector is. If you order the eigenvectors by eigenvalue and plot on a scatter graph of the data then you can see that the first vector (the principal component) should pass through the centre of the points like a line of best fit, with each corresponding vector having less significance than the one before it.

Apologies, I realise that the above plot is hardly clear, but it shows each of the 15 points plotted with the original axes (v1, v2 and v3 – in blue) and the eigenvectors (v1′, v2′ and v3′ – in purple). If there were more data points – and I had access to a better charting library – then it’d be much more apparent, but for now you’ll just have to trust that it works!

The Good Bit: Transforming the Original Data

The final step of PCA is to reduce the dimensionality of the data based on the eigenvalues calculated, by selecting only the top eigenvalue/eigenvector pairs. As the sample data has a clear pattern, the eigenvalues clearly show that one vector is far more representative of the data than the others, but in real data the margin can be far tighter. In these cases, you can either manually select the threshold, or use some thresholding algorithm to determine the cut-off point. Calculating thresholds is a really big topic in AI, with many different approaches, but one simple technique I like (and I think works well in situations like this) is to calculate the standard deviation of all of the eigenvalues, and subtract it from the first eigenvalue.

When you have selected the eigenvectors that you will use, you must construct a feature vector, which is essentially just a matrix whose columns are the chosen eigenvectors (v1, v2 … etc), and which is then transposed. The final data is then simply the feature vector multiplied by the mean adjusted data.
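Following that description – the feature vector already transposed so that its rows are the chosen eigenvectors, and the mean adjusted data transposed so that each item is a column – the transformation is:

```latex
\text{FinalData} = \text{FeatureVector} \times \text{MeanAdjustedData}^{\mathsf{T}}
```

This is the usual convention; depending on how you want the rows and columns of the result arranged, the final data may need transposing again.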

Here is the plot of the new data transformed using two eigenvectors, resulting in two dimensional data. Plotting the data in this way clearly shows that there is a strong pattern in the data – and although this was visible in the 3d plot, it would be impossible to plot, for instance, a 20-dimensional dataset. PCA also helps by removing redundant dimensions which contain very little information. As you have seen, performing PCA is a reasonably simple affair, but it is a powerful tool when trying to simplify a large and complicated dataset.

TL;DR – Solution Below

Version 2.1.0 of Twitter’s awesome css/javascript framework Bootstrap was released a couple of days ago, and so I took the opportunity today to upgrade a few of our projects that are using it from version 2.0.4. We take full advantage of Bootstrap by using less css, and compile it with .less (pronounced dotless, and conveniently available via NuGet); not only to ease with customization, but also to allow use of the helpful mixins provided.

Unfortunately, there are (at the time of writing this) errors in the v2.1.0 less files: when I dropped them in, the pages rendered with no styling, and inspecting the compiled files showed that they were being returned blank.

Step 1: The Binary Chop

I’ve never had to diagnose .less errors before – my prior experience has pretty much entirely consisted of amending existing less files and throwing mixins into my styles – so the debugging process started, as so many hacky debugs do, with the binary chop. Although it’s not a pretty debug technique, it did enable me to quickly find that the issue was caused by the variables.less file. Unfortunately (though it’s behaviour we’d usually want), .less caches the compiled css, and so a rebuild was required after each change to regenerate the css.

Step 0.9: Disable Caching

Man I wish I’d known this one before beginning step 1. We installed .less using NuGet, which adds the correct configuration to the Web.Config and makes it all Just Work – which is a great thing, but came round to bite me when I realised that I’d never actually taken the time to look at it.

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="dotless" type="dotless.Core.configuration.DotlessConfigurationSectionHandler, dotless.Core" />
  </configSections>
  <dotless minifyCss="false" cache="true" />
</configuration>

cache=”true” !? Oops, turned that off. Great, no more rebuilding!

Step 2: Closer Look at variables.less

So I knew where the problems were in variables.less, but still didn’t know what the problems actually were. Unfortunately, this required more hacky debugging: commenting out code to make it compile and uncommenting it chunk-by-chunk until it broke again. I quickly found the first error, in the Navbar variables:

// Navbar
@navbarBackground:                darken(@navbarBackgroundHighlight, 5%);
@navbarBackgroundHighlight:       #ffffff;

See it? The issue is caused by the @navbarBackgroundHighlight variable being used before it is defined. To be honest I’m not entirely sure whether this is a less syntax error, or an issue with the .less compiler implementation, but it was this that was causing the compiler to return blank css.

This was an awkward thing to find, and I expected that since it was in the code once, there were probably multiple instances of the bug, and so I decided that it was time to look into seeing if it was possible to log .less compiler errors.

Logging .less Compiler Errors

It was possible, and really easy – just a touch more configuration needed in the Web.Config:

<dotless minifyCss="true" cache="false" logger="dotless.Core.Loggers.AspResponseLogger" />

Even nicer, I’d expected the errors to be logged to disk somewhere, but conveniently they were viewable in the bootstrap.less file when inspecting it in the browser, and as expected this made it really quick and easy to identify and fix the errors.

variable @dropdownLinkBackgroundHover is undefined on line 94 in file 'less/dropdowns.less':
[93]: background-color: @dropdownLinkBackgroundHover;
[94]: #gradient > .vertical(@dropdownLinkBackgroundHover, darken(@dropdownLinkBackgroundHover, 5%));
[95]: }
from line 94:
[94]: #gradient > .vertical(@dropdownLinkBackgroundHover, darken(@dropdownLinkBackgroundHover, 5%));

Final .less Config:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="dotless" type="dotless.Core.configuration.DotlessConfigurationSectionHandler, dotless.Core" />
  </configSections>
  <dotless minifyCss="false" cache="false" logger="dotless.Core.Loggers.AspResponseLogger" />
</configuration>

Update: Before writing this blog post I filed an issue on the Bootstrap github, and less than five minutes after finishing it the issue was closed: it seems the variable ordering bug had already been found and fixed. Still, I’m glad it gave me the opportunity to dig deeper into debugging .less.