Softwire Blog


Coaching open source contributions with Codebar


20 June 2017, by

Codebar are an organisation that run weekly events to help get people from groups who are underrepresented in tech into coding. And for the past two years they and Ladies of Code have jointly run a day-long workshop in December as part of 24 Pull Requests, where more experienced devs volunteer to help people to make their first open source contribution.

Lightning talks at the Codebar 24PR event

Photo by @codebar

I coached at the 2015 event, which was a lot of fun. For 2016, Softwire also sponsored the event as part of our drive to support diversity in tech, so as well as coaching, I got to tell a diverse audience of people about why they should apply to Softwire!

There were several lightning talks about open source during the day, including ones from Andrew Nesbitt (who started 24 PR) and Charlotte Spencer (who runs @yourfirstpr).

But most of the day was spent coding. In the morning, the coaches were paired up with students according to what languages we knew or wanted to work with. JavaScript was by far the most popular, with a few people using Ruby or Python, and one person who wanted to write some Java.

I paired with Anna and Bybreen, who were just starting out with learning HTML and CSS. Git was the steepest learning curve, because neither of them had used it before, so we started out using the GitHub desktop client to make things a bit less intimidating.

Writing open source code at the Codebar 24PR event

Photo by @binusida

It was quite tricky to find issues to work on (which is one of the hardest things about 24 Pull Requests in general), but someone suggested taking a few open issues on Prepare to Code, a website which provides beginners’ guides for setting up different kinds of dev environments. The issues were just typos and broken links, but it was perfect for us because there was no dev environment to set up. Once they’d got into the swing of things, I hunted around for more things that they could fix on the same site. Hooray for buggy code!

Between them, they made six pull requests, which is pretty impressive for people who were completely new to git. And we hit the 24 pull requests goal as a group.

The day wound down with some drinks and a group chat about the things we’d been working on and what we’d learned.

It was a really good day. Everyone there was so enthusiastic about learning or teaching (or both), and it was great to be able to help people see open source (and coding in general) as more approachable.

 

Server Components: Web Components for Node.js


11 July 2016, by

Web components logo

We at Softwire are big on open source. Many of us are involved in a variety of open-source projects, and some of us are even running our own, from database setup tools to logging libraries to archiving tools to HTTP performance testing and Roslyn-based code generators.

In this post I want to take a look at a recent new addition to this list: Server Components, a small library I’ve been working on to let you render web components entirely on the server-side.

Web components, for those not familiar, are a series of new specs coming (slowly) to browsers everywhere. It’s actually a set of 4 new features (Custom Elements, HTML Templates, Shadow DOM and HTML Imports), which together provide a fantastic, fast, built-into-the-browser way of defining brand new elements. You define your elements with their own internal rendering logic, presentation and behaviour, and then compose them together to build your page. Building your web page becomes a process of defining or just downloading small composable blocks, and then declaratively clicking them together.

There’s more to this than I can cover in this post, but if you’re interested in an intro then Smashing Magazine’s guide is a pretty good place to start.

These are a fantastic set of technologies if you’re looking for a nice way to build web applications on the client-side. They give you a tool to transform page build and JavaScript from being a fundamentally messy, unstructured process (one enormous DOM, with everything operating at the same level of abstraction), to a manageable modern process, letting you layer abstractions atop one another and build your applications from many small standalone parts.

With web components, you can build web pages from independent encapsulated chunks of behaviour and UI. Within these web components, you can then nest other components, or pass them other components as input. Building a UI becomes much easier when your toolbox looks like:

 

<loading-spinner></loading-spinner>

<login-form action="/login"></login-form>

<qr-code data="http://example.com"></qr-code>

<calendar-date-picker value="2016-01-14"></calendar-date-picker>

<google-map lat=1234 long=5678 zoom=12></google-map>

<confirmation-dialog>
  Are you sure you want to quit?
</confirmation-dialog>

<item-paginator>
  <an-item>Item one</an-item>
  <an-item>Item two</an-item>

  ...

  <an-item>Item one thousand</an-item>
</item-paginator>

 

It’s a powerful paradigm, and this is really a standalone native version of the core feature that front-end frameworks like Ember, Angular and React have given us in recent years: the ability to powerfully compose together web pages quickly and easily out of smaller parts, while providing enough structure and encapsulation to keep them maintainable.

At the moment though, it’s a paradigm largely confined to the client-side. Clicking together server-side user interfaces has never really become an effective or enjoyable reality. You’re left with larger MVC frameworks that dictate your entire application structure, or trivial templating libraries that rarely offer more than token encapsulation or reusability. If you want a great experience for web UI development, you have to move to the client side.

Server Components aims to solve this. It’s taking the core of the client-side development experience, and moving it onto the server. Many web sites don’t need client-side interactivity but end up developing for client-side applications (and paying the corresponding costs in complexity, SEO, performance and accessibility) to get a better developer experience. Server Components is trying to give you that same power and flexibility to quickly build user interfaces on the web, without the client-side pain.

Screenshot of Tim.FYI

This is still early days for server components; I’m aiming to expand them significantly further as feedback comes in from more real world usage, and they’re not far past being a proof of concept for now. As a public demo though, I’ve rebuilt my personal site (Tim.FYI) with them, building the UI with delightful components like:

 

  <social-media-icons
    twitter="pimterry"
    github="pimterry"
    linkedin="pimterry"
    medium="pimterry"
  ></social-media-icons>

and even more complex ones, with interactions and component nesting:

 

  <item-feed count=20>
    <twitter-source username="pimterry" />
    <github-source username="pimterry" type-filter="PullRequestEvent" />
    <rss-source icon="stack-overflow" url="http://stackoverflow.com/feeds/user/68051" />
  </item-feed>

 

All entirely server-rendered, declarative and readable, and with zero client-side penalties. JavaScript is added on top but purely for extra interactivity and polish, rather than any of the core application functionality. Take a look at the full HTML, the tiny chunk of server code and the component sources to get a feel for how this works under the hood.

Interested? Want to go deeper and try this out yourself? Take a look at Server Components on Github.

Alternatively if you’d like some more detailed explanation first, take a look at the full introduction on Medium, complete with worked examples. If you’d like to hear even more about this, I’ll be talking to the Web Platform Podcast in more detail on July 17th, and feel free to send questions in on Twitter or in the comments below.

Improving Open-Source Deployment with Docker


6 April 2016, by

Open-source software has a lot going for it, but ease of use is not typically at the top of the list. Fortunately, it’s rarely a problem; as developers, much of the open-source code we use is in simple tools and libraries, and most of the core interactions we have with these are managed by package managers, which have focused on building a convenient, usable layer to manage this for any project.

That’s not the case for other domains though, especially non-trivial standalone applications. There are a lot of popular open-source tools that follow this model, and require you to install and run them in an environment providing all their core dependencies. WordPress (which runs this blog) is a good example, along with apps like Discourse (a forum we use for internal discussion). These provide great value and they’re great tools, but setup isn’t easy, often involves many manual steps following sometimes painful documentation, and typically then fails because of some inexplicable idiosyncrasy of the server you’re using.

Staytus is another good example of this. Staytus is an open-source web application that provides a status site for your product, aiming to be a beautiful, usable, easy to manage tool that companies can drop into place to give their customers information on how their system is doing. Take a look at their demo site to see it in action.

Staytus, like many other tools in this domain, isn’t effortless to set up though. You have to install the right version of Ruby and all the required Ruby dependencies (still an annoyingly fiddly process on Windows especially, if you’re not already using Ruby elsewhere), install Node, install, configure and prepare a MySQL server, configure Staytus to glue this all together, and then hook the Staytus startup commands into whatever service running tool you want to use. None of this is cripplingly difficult, but it’s all friction that gets in the way of putting Staytus into users’ hands.

I found Staytus recently, while looking out for exciting new open-source projects, and decided this needed fixing. I wanted to try it out, but all the above was hassle. It would be better if you could run a single command, have this all done for you, and get a server up straight away. I’d also been hankering to have a closer look at Docker, which felt like a great fit for this, so I dived in.

The steps

So, we want to provide a single command you can run which downloads, installs, configures and starts a working Staytus server. To do that, we need a few things:

  • A working Ruby 2.1 environment
  • All the required Ruby dependencies
  • Node.js (for Rails’s JS asset uglification)
  • A configured MySQL server
  • Staytus configuration files, including the MySQL server details
  • A startup script that prepares the database, and starts the service

Automating this with Docker

Docker lets us define an immutable machine image, and provides extremely fast and convenient mechanisms to share, update, and use these images.

In practice you can treat this like an incredibly good virtual machine management system. In reality under the hood the details are quite different – it is providing isolated containers for systems, but through process isolation within a single operating system, rather than totally independent machines – but while those differences power the benefits, they don’t really need to affect how you think about the basics of using Docker in practice.

I’m not going to go into the details of Docker in great depth here; I’m just going to look at an example real-world use, at a high level. If you’re really interested, take a look at their introductory video, or read through their excellent documentation.

What we need to do is define a recipe for an image of a machine that is configured and ready to run Staytus, following the steps above. We can then build this into an actual runnable machine image, and hopefully then just start it to immediately have that machine ping into existence.

Dockerfile

To start with we need the recipe for such a machine. That recipe (the Dockerfile), and the startup script it needs (a simple bash script) are below.

It’s important to note that while this is a very effective, working approach, there are parts of this that are more practical than they are Docker Best Practice. We’ll talk about that later (see ‘Caveats’).

# Start from the standard pre-prepared Ruby image
FROM ruby
MAINTAINER Tim Perry <[email protected]>

USER root

# Run all these commands inside the image
RUN apt-get update && \
    export DEBIAN_FRONTEND=noninteractive && \
    # Set MySQL password to temp-pw - reset to random password later
    echo mysql-server mysql-server/root_password password temp-pw \
      | debconf-set-selections && \
    echo mysql-server mysql-server/root_password_again password temp-pw \
      | debconf-set-selections && \
    # Install MySQL for data, node as the JS engine for uglifier
    apt-get install -y mysql-server nodejs
    
# Copy the current directory (the Staytus codebase) into the image
COPY . /opt/staytus

# Inside that directory in the image, install our dependencies
RUN cd /opt/staytus && \
    bundle install --deployment --without development:test

# When you run this image, run docker-start.sh
ENTRYPOINT /opt/staytus/docker-start.sh

# Persist the MySQL DB to an external volume
# This means it can be independent of the life of the container
VOLUME /var/lib/mysql

# Persist copies of other relevant files (config, custom themes).
# Contents of this are copied to the relevant places when the container starts
VOLUME /opt/staytus/persisted

EXPOSE 5000

With this saved as Dockerfile inside the root of the Staytus codebase, we can then run docker build . to build (or rebuild) an image following this locally.

An interesting consideration when writing these Dockerfiles is image invalidation. Docker builds intermediate images for each command here, and rebuilding an image only reruns the steps that have been invalidated, using as many from its cache as possible. That means that by writing the Dockerfile as above, rebuilding a new image with changes to the Staytus codebase is very cheap; the Ruby, Node and MySQL installation and setup phases are all cached, and we just take that image, copy the new code in, and pull down the dependencies the current codebase specifies. We only rerun the parts from COPY . /opt/staytus down. Small tweaks like this make iterating on your Docker image much easier.

Take a look at this article about working with the Docker build cache if you’re interested in this (and don’t forget to look at Docker’s best practices guide generally).

docker-start.sh

That Dockerfile installs everything required, copies the codebase into the image, and tells Docker to run the ‘docker-start.sh’ script when the image is started as a container.

To actually use this, we need a docker-start.sh script to manage the service startup process. The full content of that is below.

Note that this script includes some further database setup that could have been done above, at image definition time. That’s done here instead, to ensure the DB password is randomized for each container, not baked into the published image, so we don’t end up with Staytus images all over the internet running databases with identical default passwords. Docker doesn’t obviate the need for good security practices!

#!/bin/bash
/etc/init.d/mysql start # Start MySQL as a background service

cd /opt/staytus

# Configure DB with random password, if not already configured
if [ ! -f /opt/staytus/persisted/config/database.yml ]; then
  export RANDOM_PASSWORD=`openssl rand -base64 32`

  mysqladmin -u root -ptemp-pw password $RANDOM_PASSWORD
  echo "CREATE DATABASE staytus CHARSET utf8 COLLATE utf8_unicode_ci" | mysql -u root -p$RANDOM_PASSWORD

  cp config/database.example.yml config/database.yml
  sed -i "s/username:.*/username: root/" config/database.yml
  sed -i "s|password:.*|password: $RANDOM_PASSWORD|" config/database.yml

  # Copy the config to persist it, and later copy back on each start, to persist this config 
  # without persisting all of /config (which is mostly app code)
  mkdir /opt/staytus/persisted/config
  cp config/database.yml /opt/staytus/persisted/config/database.yml

  # On the first run only, run the staytus:install task to setup the DB schema.
  bundle exec rake staytus:build staytus:install
else
  # If it's not the first run:

  # Use the previously saved config from the persisted volume
  cp /opt/staytus/persisted/config/database.yml config/database.yml

  # The DB should already be configured. Check if there are any migrations to run though:
  bundle exec rake staytus:build staytus:upgrade
fi

# Start the Staytus service
bundle exec foreman start

Putting this to use

With this written, you can check out the Staytus codebase, run docker build . to build an image of Staytus, and run docker run -d -p 0.0.0.0:80:5000 [built-image-id] to instantly start a container with that image, listening locally on port 80.

For end users, that’s a lot easier than all the previous setup we had! There’s still a little more we can do though. Having done this, we can publish that image to Docker Hub, and users no longer need to check out the codebase at all.

The full setup now, from a blank slate, is:

  • Install Docker (a single standard platform-specific installer)
  • Run docker run -d -p 0.0.0.0:80:5000 --name=staytus adamcooke/staytus
  • Browse to http://localhost:80

(Note the ‘adamcooke/staytus’ part; that’s the published image name)

This is drastically easier than following all the original steps by hand, and very hard to do wrong!

I wrote this all up and contributed it back to Staytus itself in July last year. Adam Cooke (the maintainer of Staytus) merged it in and published the resulting image, and Staytus is now ready and available for quick, easy use. Give it a go!

Caveats

Some of this is not exactly how things should be done in Docker land – there are more than a few concessions to short-term practicality – but this does work very nicely, and provides exactly the benefits we’re looking for.

The key Docker rule that’s not being followed here is that we’ve put two processes (Staytus and its MySQL server) into a single image, to run as a single container. Instead, we should run two containers (a Staytus container, and a totally standard MySQL container) and link them together, typically using Docker Compose. At the time though Docker Compose wasn’t yet recommended for production use, and to this day moving to this model still makes it a little harder for users to get set up and running than it would be with the one image. There’s ongoing work to finish that up now though, and Staytus is likely to evolve further in that direction soon.

24 Pull Requests 2015


20 January 2016, by

At Softwire we make an effort to be involved in the open-source community that powers so much of what we do. For the last three years, we’ve spent December getting involved in 24 Pull Requests – an initiative encouraging developers to contribute pull requests to projects they use and love through the run up to Christmas – and this year is no exception!

Even more than last year though, this year has been an enormous success. We’ve even managed to bring ourselves up into an amazing 4th place position overall, out of the 9,934 organisations taking part. (This may have been strongly encouraged by the suggestion of a team croquembouche appearing if we reached the top 5. We at Softwire are certainly not immune to the charms of an exciting cake!)

Particular commendation should go to the top contributors, each of whom put in:

  1. Hereward Mills – 25 pull requests
  2. Tim Perry – 24 pull requests
  3. David Simons – 17 pull requests
  4. Rob Pethick – 14 pull requests
  5. David Broder-Rodgers – 9 pull requests

Pull Requests

A selection of particularly interesting pull requests within the 161 pull requests we contributed in December:

You can check out the full list of everything we got up to at 24pullrequests.com/organisations/Softwire

Bonus Stats

24 days of pull requests
161 PRs from the Softwire team (so 7 per day, on average)
36 contributors contributing at least one PR (so 4 each, on average)
78 projects contributed to (including 17 PRs to DefinitelyTyped, 14 to Workflowy2Web and 8 to AssertJ & 24PullRequests)

Fantastic work all round!

24 Pull Requests: The Hackathon


18 December 2015, by

We at Softwire care about open-source, and for the last 3 years every December we’ve been very involved in 24 Pull Requests, an initiative to encourage people to give back to the open-source community in the 24 days of advent up to Christmas.

That’s still ongoing (although we’re doing great so far, and watch this space for details once we’re done), but this year we decided to kick it up a notch further and run an internal full-day hackathon, which took place on the 7th. This came with free holiday for employees (with 1/2 a day of holiday from our morale budget and 1/2 a day from our holiday charity matching scheme), to make it as easy as possible for people to get involved, and prizes for the first 3 contributions and the best 3 contributions (judged at the end).

The Softwire team enjoy a good breakfast

We kicked the day off with a cooked breakfast buffet at 9, followed by a brief bit of warm-up and setup, and then a team stand-up with the 20 or so participants. Everybody quickly went round and outlined what they were looking at (or what they wanted, if they didn’t have an idea), so people doing similar things could find each other, and people with more enthusiasm than specific plans could find inspiration or other people to pair with and help out. And from there, we were off!

We then managed to get the prizes for the first 3 pull requests all claimed by 11.30, with Rob Pethick adding in missing resources in Humanizer, Hereward Mills removing some unused code from Cachet, and Dan Corder moving AssertJ from JUnit parameterized tests to JUnitParams (quickly followed by a stream of further PRs coming in just slightly behind).

This pace then accelerated further through the afternoon, as people found projects and got more set up, resulting in a very impressive 46 pull requests total by the end of the day (taking us to a total of 89 for the month so far).

These covered all sorts of projects and languages, and included changes like:

Plus a whole range of other interesting and slightly larger changes, along with the many small typo fixes, documentation updates, setup instruction fixes and so on that are so important in really making open-source projects polished, reliable and usable.

Finally, the most credit must of course go to our winning three pull requests of the day:

  1. Adding some detailed and fiddly tests for various statistical distributions in jStat, and fixing the bugs that they found too, by David Simons
  2. Fixing a tricky Sequelize bug, to support column removal with MSSQL DBs, by Michael Kearns
  3. Improving the accuracy of error highlighting in the output of the TypeScript compiler (despite knowing neither the compiler nor even the language beforehand), by Iain Monro

Our team hard at work

Great work all around. We enjoyed an excellent day of open-sourcing, made a lot of progress, and we’re now in a great place to do more and more open-source over the coming weeks, and beyond once 24PRs is complete.

We’re now up to 96 pull requests in total, and counting, and we’re hoping to keep making progress up the 24PRs teams leaderboard as it continues!

Watch this space for more updates on the tools we’ve been working on and interesting changes we’ve been making, or follow us on twitter for extra details.

Typing Lodash in TypeScript, with Generic Union Types


5 November 2015, by

TypeScript is a powerful compile-to-JS language for the browser and node, designed to act as a superset of JavaScript, with optional static type annotations. We’ve written a detailed series of posts on it recently (start here), but in this post I want to talk about some specific open-source work we’ve done with it around Lodash, and some of the interesting details around how the types involved work.

TypeScript

For those of you who haven’t read the whole series: TypeScript gives you a language very similar to JavaScript, but including future features (due to the compile step) such as classes and arrow functions, and support for more structure, with its own module system, and optional type annotations. It allows you to annotate variables with these type annotations as you see fit, and then uses an extremely powerful type inference engine to automatically infer types for much of the rest of your code from there, automatically catching whole classes of bugs for you immediately. This is totally optional though, and any variables without types are implicitly assigned the ‘any’ type, opting them out of type checks entirely.
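As a small illustration of annotations, inference and the ‘any’ escape hatch working together, here is a minimal sketch. The variable names are made up for this example and don’t come from any real project.

```typescript
// Explicit annotation: `price` must always be a number.
const price: number = 4.5;

// No annotations needed here: TypeScript infers `quantities` as number[]
// and `total` as number from the reduce call.
const quantities = [1, 2, 3];
const total = quantities.reduce((sum, quantity) => sum + quantity * price, 0); // 27

// Uncommenting the next line is a compile-time error, because a string
// is not assignable to a number:
// const broken: number = "4.5";

// Opting out: an `any` value is not type-checked at all, so this
// string-times-number mistake compiles and is only visible at runtime.
const untyped: any = "4.5";
const suspicious = untyped * 2; // 9, via JavaScript's implicit coercion
```

The commented-out line is the kind of mistake the compiler catches for you; the last two lines show what you give up as soon as a value becomes ‘any’.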

This all works really well, and TypeScript has quite a few fans over here at Softwire. It gets more difficult when you’re using code written outside your project though, as most of the JavaScript ecosystem is written in plain JavaScript, without type annotations. This takes away some of your new exciting benefits; every library object is treated as having ‘any’ type, so all method calls return ‘any’ type, and passing data through other libraries quickly untypes it.

Fortunately the open-source community stepped up and built DefinitelyTyped, a compilation of external type annotations for other existing libraries. These ‘type definitions’ can be dropped into projects alongside the real library code to let you write completely type-safe code, using non-TypeScript libraries.

This is great! Sadly, it’s not that simple in practice. These type definitions need to be maintained, and can sometimes be inaccurate and out of date.
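To make the idea of a type definition concrete, here is a rough sketch of what one buys you. The `untypedLib` object and the `ChunkLib` interface are hypothetical stand-ins invented for this example; real DefinitelyTyped definitions live in separate declaration files rather than being assigned like this.

```typescript
// Stand-in for a plain JavaScript library: with no type information
// available, TypeScript can only treat it as `any`.
const untypedLib: any = {
  chunk: (items: any[], size: number) => {
    const out: any[][] = [];
    for (let i = 0; i < items.length; i += size) {
      out.push(items.slice(i, i + size));
    }
    return out;
  }
};

// A hand-written description of the same object, playing the role of
// a type definition.
interface ChunkLib {
  chunk<T>(items: T[], size: number): T[][];
}

// Viewing the library through the definition restores type checking.
const typedLib: ChunkLib = untypedLib;

const loose = untypedLib.chunk([1, 2, 3, 4, 5], 2); // type: any
const pairs = typedLib.chunk([1, 2, 3, 4, 5], 2);   // type: number[][]
```

Nothing checks that `ChunkLib` actually matches the underlying code, which is exactly how a definition can end up inaccurate or out of date.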

In this article I want to take a look at a particular example of that, around Lodash’s _.flatten() function, and use this to look at some of the more exciting newer features in TypeScript’s type system, and how that can give us types to effectively describe fairly complex APIs with ease.

What’s _.flatten?

Let’s step back. Lodash is a great library that provides utility functions for all sorts of things that are useful in JavaScript, notably including array manipulation.

Flatten is one of these methods. Flattening an array unwraps any arrays that appear nested within it, and includes the values within those nested arrays instead. Flatten also takes an optional second boolean parameter, defining whether this process should be recursive. An example:

 

_.flatten([1, 2, 3]);                     // returns [1, 2, 3] - does nothing

_.flatten([[1], [2, 3]]);                 // returns [1, 2, 3] - unwraps both inner arrays

_.flatten([[1], [2, 3], 4]);              // returns [1, 2, 3, 4] - unwraps both inner arrays,
                                          // and includes the existing non-list element

_.flatten([[1], [2, 3], [[4, 5]]]);       // returns [1, 2, 3, [4, 5]] - unwraps all arrays,
                                          // but only one level

_.flatten([[1], [2, 3], [[4, 5]]], true); // returns [1, 2, 3, 4, 5] - unwraps all arrays 
                                          // recursively

 

This is frequently very useful, especially in a collection pipeline, and is fairly easy to describe and understand. Sadly it’s not that easy to type, and the previous DefinitelyTyped type definitions didn’t provide static typing over these operations.
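If it helps to see the behaviour spelled out, the examples above can be reproduced with a simplified stand-in like the one below. This is a sketch of the behaviour described here, not Lodash’s actual implementation, and `flattenSketch` is a name invented for this post. It is deliberately typed with ‘any’ throughout, since the typing is the hard part we’re about to look at.

```typescript
// A simplified, untyped stand-in for _.flatten.
function flattenSketch(array: any[], isDeep = false): any[] {
  const result: any[] = [];
  for (const item of array) {
    if (Array.isArray(item)) {
      // Unwrap one level, or every level when isDeep is true.
      const inner: any[] = isDeep ? flattenSketch(item, true) : item;
      for (const value of inner) {
        result.push(value);
      }
    } else {
      // Non-array values are kept as they are.
      result.push(item);
    }
  }
  return result;
}

const oneLevel = flattenSketch([[1], [2, 3], [[4, 5]]]);        // [1, 2, 3, [4, 5]]
const allLevels = flattenSketch([[1], [2, 3], [[4, 5]]], true); // [1, 2, 3, 4, 5]
```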

What’s wrong with the previous flatten type definitions?

Lots of things! The _.flatten definitions include some method overloads that don’t exist, and have some unnecessary duplication and incorrect documentation, but most interestingly their return type isn’t based on their input, which takes away your strict typing. Specifically, the method I’m concerned with has a type definition like the below:

 

interface LoDashStatic {
  flatten<T>(array: List<any>, isDeep?: boolean): List<T>;
}

 

This type says that the flatten method on the LoDashStatic interface (the interface that _ implements) takes a list of anything, and an optional boolean argument, and returns an array of T’s, where T is a generic parameter to flatten. Because T only appears in the output though, not the type of our ‘array’ parameter, this isn’t useful! We can pass a list of numbers, and tell TypeScript we’re expecting a list of strings back, and it won’t know any better.
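Here is a minimal sketch of that problem in action. `looseFlatten` is a hypothetical function written for this example, with the same signature shape as the definition above and a simple body so that it can actually run.

```typescript
// T appears only in the return type, so the caller can pick it freely
// and nothing ties it to the input.
function looseFlatten<T>(array: any[], isDeep?: boolean): T[] {
  const result: any[] = [];
  for (const item of array) {
    if (Array.isArray(item)) {
      const inner: any[] = isDeep ? looseFlatten<any>(item, true) : item;
      for (const value of inner) {
        result.push(value);
      }
    } else {
      result.push(item);
    }
  }
  return result;
}

const numbers = [[1], [2, 3]];

// This compiles without complaint, even though no strings are involved.
const supposedlyStrings: string[] = looseFlatten<string>(numbers);

// Typed as string, but actually the number 1 at runtime. Calling a
// string method such as toUpperCase() on it would throw a TypeError.
const first = supposedlyStrings[0];
```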

We can definitely do better than that. Intuitively, you can think of the type of this method as being (for any X, e.g. string, number, or HTMLElement):

 

_.flatten(list of X): returns a list of X
_.flatten(list of lists of X): returns a list of X
_.flatten(list of both X and lists of X): returns a list of X

_.flatten(list of lists of lists of X): returns a list of lists of X (unwraps one level)
_.flatten(list of lists of lists of X, true): returns a list of X (unwraps all levels)

 

(Ignoring the case where you pass false as the 2nd argument, just for the moment)

Turning this into a TypeScript type definition is a little more involved, but this gives us a reasonable idea of what’s going on here that we can start with.

How do we describe these types in TypeScript?

Let’s start with our core feature: unwrapping a nested list with _.flatten(list of lists of X). The type of this looks like:

flatten<T>(array: List<List<T>>): List<T>;

Here, we say that when I pass flatten a list that only contains lists, which contain elements of some common type T, then I’ll get back a list containing only type T elements. Thus if I call _.flatten([[1], [2, 3]]), TypeScript knows that the only valid T for this is ‘number’, where the input is List<List<number>>, and the output will therefore definitely be a List<number>, and TypeScript can quickly find your mistake if you try to do stupid things with that.

That’s not sufficient though. This covers the [[1], [2, 3]] case, but not the ultra-simple case ([1, 2, 3]) or the mixed case ([[1], [2, 3], 4]). We need something more general that will let TypeScript automatically know that in all those cases the result will be a List<number>.

Fortunately, union types let us define general structures like that. Union types allow you to say a variable is of either type X or type Y, with syntax like: var myVariable: X|Y;. We can use this to handle the mixed value/lists of values case (and thereby both other single-level cases too), with a type definition like:

flatten<T>(array: List<T | List<T>>): List<T>;

I.e. if given a list of items that are either type T, or lists of type T, then you’ll get a list of T back. Neat! This works very nicely, and for clarity (and because we’re going to reuse it elsewhere) we can refactor it out with a type alias, giving a full implementation like:

 

interface MaybeNestedList<T> extends List<T | List<T>> { }

interface LoDashStatic {
  flatten<T>(array: MaybeNestedList<T>): List<T>;
}
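To see this type at work, here is a runnable sketch. The `List<T>` interface below is my assumption of the array-like shape that the Lodash definitions use, and `flattenOnce` is a hypothetical standalone function rather than the real _.flatten.

```typescript
// Assumed shape of List<T>: anything array-like.
interface List<T> {
  [index: number]: T;
  length: number;
}

interface MaybeNestedList<T> extends List<T | List<T>> { }

function flattenOnce<T>(array: MaybeNestedList<T>): T[] {
  const result: T[] = [];
  for (let i = 0; i < array.length; i++) {
    const item = array[i];
    if (Array.isArray(item)) {
      // Nested array: copy its elements up one level. For simplicity,
      // this sketch only unwraps real arrays, not other array-likes.
      for (const inner of item as T[]) {
        result.push(inner);
      }
    } else {
      // Plain value: the compiler can't rule out List<T> here, so cast.
      result.push(item as T);
    }
  }
  return result;
}

// T is inferred as number in all three single-level cases.
const plain: number[] = flattenOnce([1, 2, 3]);
const nested: number[] = flattenOnce([[1], [2, 3]]);
const mixed: number[] = flattenOnce([[1], [2, 3], 4]);

// Unlike the old definition, this is now rejected at compile time:
// const wrong: string[] = flattenOnce([[1], [2, 3]]);
```

The casts inside the function body are a concession to the compiler; the point here is the signature, which is all a type definition file contains anyway.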

 

Can we describe the recursive flatten type?

That fully solves the one-level case. Now, can we solve the recursive case as well, where we provide a 2nd boolean parameter to optionally apply this process recursively to the list structure?

No, sadly.

Unfortunately, in this case the return type depends not just on the types of the parameters provided, but the actual runtime values. _.flatten(xs, false) is the same as _.flatten(xs), so has the same type as before, but _.flatten(xs, true) has a different return type, and we can’t necessarily know which was called at compile time.

(As an aside: with constant values technically we could know this at compile-time, and TypeScript does actually have support for overloading on constants for strings. Not booleans yet though, although I’ve opened an issue to look into it)

We can get close though. To start with, let’s ignore the false argument case. Can we type a recursive flatten? Our previous MaybeNestedList type doesn’t work, as it only allows lists of X or lists of lists of X, and we want to allow ‘lists of (X or lists of)* X’ (i.e. any depth of list, with an eventually common contained type). We can do this by defining a type alias similar to MaybeNestedList, but making it recursive. With that, a basic type definition for this (again, ignoring the isDeep = false case) might look something like:

 

interface RecursiveList<T> extends List<T | RecursiveList<T>> { }

interface LoDashStatic {
  flatten<T>(array: RecursiveList<T>, isDeep: boolean): List<T>;
}

 

Neat, we can write optionally recursive type definitions! Even better, the TypeScript inference engine is capable of working out what this means, and inferring the types for this (well, sometimes. It may be ambiguous, in which case we’ll have to explicitly specify T, although that is then checked to guarantee it’s a valid candidate).
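As a rough sketch of the recursive type in use, the snippet below checks an arbitrarily nested literal against RecursiveList and flattens it fully. As before, List is a simplified stand-in for the real Lodash type, and flattenDeep is a hypothetical helper for illustration only:

```typescript
// Simplified stand-in for the array-like List type (an assumption for this sketch).
interface List<T> {
  [index: number]: T;
  length: number;
}

interface RecursiveList<T> extends List<T | RecursiveList<T>> { }

// Hypothetical fully recursive flatten, assuming nested lists are real arrays
// and that T is not itself an array type.
function flattenDeep<T>(array: RecursiveList<T>): T[] {
  const result: T[] = [];
  for (let i = 0; i < array.length; i++) {
    const item = array[i];
    if (Array.isArray(item)) {
      // Recurse into the nested list, however deep it goes.
      result.push(...flattenDeep<T>(item as RecursiveList<T>));
    } else {
      result.push(item as T);
    }
  }
  return result;
}

// Any depth of nesting is accepted, as long as the leaves share one type.
const nested: RecursiveList<number> = [1, [2, [3, [4]]], 5];
const leaves = flattenDeep<number>(nested);
console.log(leaves); // [ 1, 2, 3, 4, 5 ]
```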

Unfortunately when we pass isDeep = false, this type definition isn’t correct: it lets _.flatten([[[1]]], false) be typed as returning a List<number>, but because that call isn’t recursive it only flattens one level, and will actually always return [[1]] (a List<List<number>>).

Union types save the day again though. Let’s make this a little more general (at the cost of being a little less specific):

flatten<T>(array: RecursiveList<T>, isDeep: boolean): List<T> | RecursiveList<T>;

Here we make the return type more general, so that it includes both potential cases explicitly. Either we’re returning a totally unwrapped list of T, or we’re returning a list that contains at least one more level of nesting (conveniently, this has the same definition as our recursive list input). This is actually a little duplicative, since List<T> is itself a RecursiveList<T>, but including both definitions is a bit clearer, to my eye. This isn’t quite as specific as we’d like, but it is now always correct, and still much closer than the original type (where we essentially had to blind cast things, or accept any-types everywhere).

Putting all this together

These two types together allow us to replace the original definition. Rather than keeping a single definition with an optional isDeep parameter, we can include two separate definitions, which lets us be more specific about the case where the boolean parameter is omitted. Wrapping all that up, this takes us from our original definition:

 

interface LoDashStatic {
  flatten<T>(array: List<any>, isDeep?: boolean): List<T>;
}

 

to our new, improved, and more typesafe definition:

 

interface RecursiveList<T> extends List<T | RecursiveList<T>> { }
interface MaybeNestedList<T> extends List<T | List<T>> { }

interface LoDashStatic {
  flatten<T>(array: MaybeNestedList<T>): List<T>;
  flatten<T>(array: RecursiveList<T>, isDeep: boolean): List<T> | RecursiveList<T>;
}
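To check how the two definitions behave side by side, here is a self-contained sketch that attaches them as overloads to a small stand-in implementation. The List interface and the function body are assumptions made for this example (the real Lodash implementation is different); only the two overload signatures mirror the definitions above:

```typescript
// Simplified stand-in for the array-like List type (an assumption for this sketch).
interface List<T> {
  [index: number]: T;
  length: number;
}

interface RecursiveList<T> extends List<T | RecursiveList<T>> { }
interface MaybeNestedList<T> extends List<T | List<T>> { }

// The two signatures from the definition above, written as function overloads.
function flatten<T>(array: MaybeNestedList<T>): List<T>;
function flatten<T>(array: RecursiveList<T>, isDeep: boolean): List<T> | RecursiveList<T>;
// Stand-in implementation: loosely typed, since callers only ever see the overloads.
function flatten(array: any, isDeep: boolean = false): any {
  const result: any[] = [];
  for (let i = 0; i < array.length; i++) {
    const item = array[i];
    if (!Array.isArray(item)) {
      result.push(item);
    } else if (isDeep) {
      // Deep: keep unwrapping nested lists.
      result.push(...(flatten(item, true) as any[]));
    } else {
      // Shallow: unwrap exactly one level.
      result.push(...item);
    }
  }
  return result;
}

const oneLevel = flatten<number>([1, [2, 3]]);        // typed as List<number>
const shallow = flatten<number>([[[1]]], false);      // List<number> | RecursiveList<number>
const deep = flatten<number>([[[1]]], true);          // List<number> | RecursiveList<number>

console.log(JSON.stringify(oneLevel)); // [1,2,3]
console.log(JSON.stringify(shallow));  // [[1]]
console.log(JSON.stringify(deep));     // [1]
```

The shallow call is the interesting one: its runtime value is still nested, which is why the second signature has to admit RecursiveList<T> in its return type.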

 

You can play around with this for yourself, and examine the errors and the compiler output, using the TypeScript Playground.

We’ve submitted this back to the DefinitelyTyped project (with other related changes), in https://github.com/borisyankov/DefinitelyTyped/pull/4791, and this has now been merged, fixing this for Lodash lovers everywhere!

OpenStack – An overview


16 October 2015, by


What is it

OpenStack is open source virtualisation software that can be run on generic hardware, allowing you to build your own cloud. In order to provide high availability, several servers can be clustered together. This allows resources from several servers to be pooled into one place when deploying machines.


Open-Source at Softwire


4 September 2015, by

Here at Softwire, we’re pretty keen on open-source. A huge amount of the work we do day to day depends on the community-developed tools that power the ecosystems around languages like Java, C# and JavaScript.

Sometimes it’s nice to give back. There’s the lovely warm feeling of Doing Good, but we’ve also found it’s a great way to improve your knowledge of your tools, train your development skills outside of normal project work and give ourselves (and everybody else) better tools to use in future. It feels good to be able to look back at the improvements you’ve made, and to reflect on the work we’ve been doing recently.

That’s what I want to do with this post: highlight some of the great contributions Softwire people have made back to the open-source community so far this year, and talk about options for finding things to help out with and getting more involved (both for us, and to encourage any of you who feel keen after reading this!).

Recent open-source contributions we’re proud of:

Want to get involved?

Inspired by any of the above, and interested in trying your hand at a bit of open-source contribution yourself? We’ve put together a list of useful ways we’ve found to find and dig into interesting projects:

Help write Firefox’s Dev Tools

Mostly just needs JS/HTML/CSS skills. firefox-dev.tools has a list of bugs for total beginners to get started with, including a filter to find ‘mentored’ bugs, where there’s somebody assigned to help whoever picks the bug up first get set up and going. wiki.mozilla.org/DevTools/GetInvolved has more details.

Interested in helping out with other Mozilla open-source projects (Firefox OS, Rust, Servo, Firefox itself)? Take a look through whatcanidoformozilla.org.

Subscribe to interesting projects on CodeTriage

CodeTriage lets you search projects by their number of open issues and language, e.g. Java, C# or JavaScript. Get emailed bugs in projects you subscribe to, and have a go at fixing any that sound interesting.

Help write better documentation

DocsDoctor will send you bits of documentation from projects you subscribe to; take a look through them, think about whether they could be improved or clarified, and put in a quick patch to make everything more understandable for everybody.

For the more hardcore, you can also sign up to get sent totally undocumented methods and classes, and try to put together some useful notes on what they’re for and how to use them.

Make the tools you use day to day better for everybody

What libraries are you using at the moment? Do they work perfectly? Anything in the API that’s confusing, any poor documentation, or weird behaviour you have to work around? Send them a quick bug report on GitHub, or have a look at fixing it yourself, and make your life (and everybody else’s) easier forevermore.

Come work for Softwire

We love this stuff, and we’re hiring.

 

Fixing aggregation over relationships with Sequelize


17 July 2015, by

I’ve been setting up Sequelize recently for a new Softwire Node.js project that’s starting up. As part of the initial work we wanted to investigate Sequelize (the go-to Node SQL ORM) in a little depth, to make sure it could neatly handle some of the trickier operations we wanted to perform, without us having to fall back to raw SQL all the time.

Most of these came out very easily in the wash, but one was trickier and needed investigation and upstream patches, and I’d like to take a closer look at that in this post. The challenging case: aggregating values across a relationship (i.e. SUM over a column from a JOIN).

Some Background

The project is sadly confidential, but the core operation has an easy equivalent in the classic Blog model of Posts and Comments. We have lots of Posts on our blog, and each Post has 0 or more Comments. For the purposes of this example, Comments can have some number of likes. Defining this model in Sequelize looks something like:

var Sequelize = require('sequelize');
var db = require('./db-connection');

var Post = db.define('Post', {
    publishDate: { type: Sequelize.DATEONLY }
});
var Comment = db.define('Comment', {
    likes: { type: Sequelize.INTEGER }
});

Post.hasMany(Comment);

With this model in place, Sequelize makes it easy for us to do some basic querying operations, using its Promise-based API:

db.sync().then(function () {
    return Post.findAll({
        // 'Include' joins the Post table with the Comments table, to load both together
        include: [ { model: Comment } ]
    });
}).then(function (results) {
    // Logs all Posts, each with a 'Comments' field containing a nested array of their related Comments
    console.log(JSON.stringify(results));
}).catch(console.error);

The Problem

Given this Post/Comment model, we want to get the total number of likes across all comments for a matched set of articles (‘how many likes did we get in total for this month’s articles?’). A great result would be a SQL query like:

SELECT SUM(comment.likes) AS totalLikes
FROM dbo.Posts AS post
LEFT OUTER JOIN dbo.Comments AS comment ON post.id = comment.postId
WHERE post.publishDate >= '2015-05-01'

Sequelize in principle supports queries like this. It allows an ‘include’ option (in the example above), a ‘where’ option (for filtering) and an ‘attributes’ option (specifying the fields to return). There is also Sequelize.fn, to call SQL functions as part of expressions (such as the attributes we want returned). Combining all of these together suggests we can build the above with something like:

db.sync().then(function () {
    return Post.findAll({
        // Join with 'Comment', but don't actually return the comments
        include: [ { model: Comment, attributes: [] } ],
        // Return SUM(Comment.likes) as the only attribute
        attributes: [[db.fn('SUM', db.col('Comments.likes')), 'totalLikes']],
        // Filter on publishDate
        where: { publishDate: { gte: new Date('2015-05-01') } }
    });
}).then(function (result) {
    console.log(JSON.stringify(result));
}).catch(console.error);

Sadly this doesn’t work. It instead prints “Column ‘Posts.id’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause”, because the generated SQL looks like:

SELECT Post.id, SUM(Comments.likes) AS totalLikes
FROM Posts AS Post
LEFT OUTER JOIN Comments AS Comments ON Post.id = Comments.PostId
WHERE Post.publishDate >= '2015-05-01 00:00:00.000 +00:00';

This is wrong! This attempts to load the total aggregate result, but also to load it along with all the Post ids involved, which isn’t really meaningful in SQL, unfortunately. If we group by Post.id this will work (and that is possible in Sequelize), but in reality there are a large number of Posts here, and we’d just like a single total, not a total per-Post that we have to load and add up later ourselves.
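To make the difference between the two result shapes concrete, here is a small in-memory sketch. It doesn’t touch Sequelize or a database at all: the PostRow and CommentRow types and the sample data are made up for illustration, and the filtering only approximates the JOIN (posts with no comments simply contribute nothing to the totals).

```typescript
// Hypothetical row shapes and sample data, invented for this illustration.
interface PostRow { id: number; publishDate: string; }
interface CommentRow { postId: number; likes: number; }

const posts: PostRow[] = [
  { id: 1, publishDate: "2015-04-20" },
  { id: 2, publishDate: "2015-05-03" },
  { id: 3, publishDate: "2015-05-10" },
];
const comments: CommentRow[] = [
  { postId: 1, likes: 7 },
  { postId: 2, likes: 2 },
  { postId: 2, likes: 5 },
  { postId: 3, likes: 1 },
];

// Equivalent of WHERE post.publishDate >= '2015-05-01'
// (ISO-formatted date strings compare correctly as plain text).
const matchedIds = posts
  .filter(p => p.publishDate >= "2015-05-01")
  .map(p => p.id);
const joined = comments.filter(c => matchedIds.indexOf(c.postId) !== -1);

// With GROUP BY Post.id: one total per post, which we'd have to add up ourselves.
const perPost: { [postId: number]: number } = {};
for (const c of joined) {
  perPost[c.postId] = (perPost[c.postId] || 0) + c.likes;
}

// Without grouping: the single total we actually want.
const totalLikes = joined.reduce((sum, c) => sum + c.likes, 0);

console.log(JSON.stringify(perPost)); // {"2":7,"3":1}
console.log(totalLikes);              // 8
```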

Making this Work

Unfortunately it turns out that there is no easy way to do this in Sequelize without getting involved in the internals. Fortunately it’s open-source, so we can do exactly that.

The real problem here is that the ‘attributes’ array we provide isn’t being honoured, and Posts.id is being added to it. After a quick bit of analysis tracing back where ‘attributes’ get changed, it turns out the cause of this is inside findAll, in Sequelize’s model.js. Take a look at the specific code in lib/model.js lines 1176-1187. This code ensures that if you ever use an ‘include’ (JOIN), you must always return the primary key in your results, even if you explicitly set ‘attributes’ to not do that. Not helpful.

The reason this exists is to ensure that Sequelize can internally interpret these results when building models from them, to reliably deduplicate when the same post comes back twice with two different joined comments, for example. That’s not something we need here though, as we’re just trying to load an aggregate and we don’t want populated ‘Post’ models back from this, and it causes a fairly annoying problem (for us and various others). There is a ‘raw’ option that disables building a model from these results, but that sadly doesn’t make any difference to the behaviour here.
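The deduplication idea is easier to see with a toy example. The sketch below is not Sequelize’s actual code; JoinedRow, PostModel and buildPosts are hypothetical names, and it only shows why flat JOIN rows need a primary key before they can be folded back into nested models:

```typescript
// Hypothetical shapes: one flat row per (post, comment) pair, as a JOIN returns them.
interface JoinedRow {
  postId: number;
  publishDate: string;
  commentLikes: number | null; // null when a post has no comments (LEFT OUTER JOIN)
}
interface PostModel {
  id: number;
  publishDate: string;
  comments: { likes: number }[];
}

// Fold the flat rows back into one model per post, keyed on the primary key.
function buildPosts(rows: JoinedRow[]): PostModel[] {
  const byId: { [id: number]: PostModel } = {};
  const ordered: PostModel[] = [];
  for (const row of rows) {
    let post = byId[row.postId];
    if (!post) {
      // First time we've seen this primary key: create the model.
      post = { id: row.postId, publishDate: row.publishDate, comments: [] };
      byId[row.postId] = post;
      ordered.push(post);
    }
    if (row.commentLikes !== null) {
      post.comments.push({ likes: row.commentLikes });
    }
  }
  return ordered;
}

const models = buildPosts([
  { postId: 1, publishDate: "2015-05-03", commentLikes: 2 },
  { postId: 1, publishDate: "2015-05-03", commentLikes: 5 },
  { postId: 2, publishDate: "2015-05-10", commentLikes: null },
]);
console.log(models.length);             // 2
console.log(models[0].comments.length); // 2
```

Without postId in each row there would be no reliable way to tell that the first two rows belong to the same post, which is why the key is useful when building models and pointless when all we want is a single aggregate.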

In the short-term, Sequelize has ‘hooks’ functionality that lets you tie your own code into its query generation. Using that, we can put together a very simple workaround by changing our connection setup code to look something like the below (and this is what we’ve done, for the very short-term).

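// Restore the attributes the caller originally asked for, recursing into any includes
function resetAttributes(options) {
    if (options.originalAttributes !== undefined) {
        options.attributes = options.originalAttributes;
        if (options.include) {
            options.include.forEach(resetAttributes);
        }
    }
}

// 'database', 'username' and 'password' are your own connection settings
var db = new Sequelize(database, username, password, {
    "hooks": {
        "beforeFindAfterOptions": function (options) {
            // Only override the attributes for raw queries, which don't build models
            if (options.raw) resetAttributes(options);
        }
    }
});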

If you’re in this situation right now, the above will fix it. It changes query generation to drop all ‘attributes’ overrides if ‘raw’ is set on the query, solving this issue, so that running the aggregation query above with ‘raw: true’ then works. Magic.

Solving this Permanently

That’s all very well for now, but it feels like a bit of a hack, and this behaviour seems like something that’s not desirable for Sequelize generally anyway.

Fortunately, we’ve now fixed it for you, in a pull request up at https://github.com/sequelize/sequelize/pull/4029.

This PR solves this issue properly, updating the internals of model.js so that the specified attributes are left unchanged when changing them isn’t necessary (i.e. when ‘raw’ is set to true). It covers both this case (attributes on the query root) and the include case (the attributes of your JOINed models). That PR’s recently been merged, solving this issue long-term in a cleaner way, and should be available from the next Sequelize release.

Once that’s in place, this blog post becomes much shorter: if you want to aggregate over an include in Sequelize, add ‘raw: true’ to your query. Phew!

24 Pull Requests 2014


20 February 2015, by

As part of our ongoing involvement in the open-source community we here at Softwire spent December getting involved in 24 Pull Requests, an initiative encouraging developers to contribute pull requests to projects they use and love through the run up to Christmas.

It’s always fun getting involved in development on new projects like this, but to add to that we’ve taken to running this as an internal competition, with a £50 prize for the winner and a £10 runner-up prize, plus two £20 prizes for some ‘pull request races’ en route, where a specific easy-to-fix issue in a project is announced, and the first person to put a pull request in fixing it wins.

Tallying up our results, we’ve ended up with some great contributions to a huge variety of projects here, across the team. First, the leaderboard:

Final leaderboard of pull requests:

Jamie Humphries – 9
Tim Perry – 9
Hereward Mills – 4
Dan Corder – 3
Dave Simons – 3
Ed Wagstaff – 2
Harry Cummings – 2
Andy Patterson – 1

Bonus stats:

24 days of pull requests
32 pull requests sent in total
9 people contributed at least one
21 projects got a PR from us, with the big winners being AssertJ, ScriptCS, and Mockito (in that order)
66 commits in total, across all PRs
5 languages – Java, C#, JavaScript, PHP, Ruby (in that order, approximately)
186th out of 7000 organisations – ahead of development teams including Django, Python and MongoDB, and companies like Twitter, SoundCloud and Heroku.