Blog

● posted by Henrik Joreteg

TodoMVC is a neat project with a simple idea: to build the same application with a whole slew of different frameworks so we can compare how they solve the same problems.

The project’s maintainers asked me to contribute an example in Ampersand.js, so that’s what we did.

There are a few aspects of the implementation that I thought were worth writing up.

First, some highlights

  1. filesize: The total filesize of all the JS assets required for the app is only 24kb (minified/gzipped), which is smaller than jQuery by itself. For comparison, the Ember.js version is 165kb, and that’s without including compiled templates.

  2. super efficient DOM updating: It’s worth noting that after the initial render, changes to state result in DOM updates only when there’s actually a change that needs to be made in the DOM. Except for a single, minor exception (described later), DOM updates are done via specific DOM methods such as .setAttribute, .innerText, and .classList.add, not via .innerHTML. This matters because innerHTML is generally slower, since it requires the browser to parse HTML. In short, after the initial render the app does the absolute minimum number of DOM updates required, as efficiently as possible.

  3. good code hygiene: Maintainable, readable code. All state is stored in models, zero state is stored in the DOM. Fully valid HTML (I’m looking at you, Angular). Call me old school, but behavior is in JS, styles are in CSS, and structure is in HTML.

  4. fully template language agnostic: We’re using jade here, but it really doesn’t matter because the bindings are all handled outside of the templating, as we’ll see later. You could easily use the template language of your choice, or even plain HTML strings.

Ok, now let’s get into some more of the details.

Persisting todos to localStorage

The TodoMVC project’s app spec specifies:

Your app should dynamically persist the todos to localStorage. If the framework has capabilities for persisting data (i.e. Backbone.sync), use that, otherwise use vanilla localStorage. If possible, use the keys id, title, completed for each item. Make sure to use this format for the localStorage name: todos-[framework]. Editing mode should not be persisted.

This is ridiculously easy in Ampersand. It could be done as a mixin, so we could use the “backbone-esque” .save() methods on the models. But given how straightforward this use case is, it’s simpler to just do it directly. We simply create two methods.

One to write the data to localStorage:

writeToLocalStorage: function () {
  localStorage[STORAGE_KEY] = JSON.stringify(this);
}

One to retrieve it:

readFromLocalStorage: function () {
  var existingData = localStorage[STORAGE_KEY];
  if (existingData) {
    this.set(JSON.parse(existingData));
  }
}

You’ll notice we’re just passing this to JSON.stringify. This works because ampersand collection has a toJSON() method and the spec for the browser’s built-in JSON interface states that it will look for and call a toJSON method on the object passed in, if present. So rather than doing JSON.stringify(this.toJSON()), we can just do JSON.stringify(this). Ampersand collection’s toJSON is simply an alias to serialize which loops through the models it contains and calls each of their serialize methods and returns them all as a serializable array.
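
A tiny standalone illustration of that standard JSON behavior (this is not code from the app, just a demonstration):

var fakeCollection = {
  toJSON: function () {
    return [{ title: 'write blog post', completed: false }];
  }
};

// JSON.stringify finds and calls toJSON() before serializing
JSON.stringify(fakeCollection);
// => '[{"title":"write blog post","completed":false}]'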

So far we’ve just created the methods and not actually used them, so how do we wire that up?

Well, given how simple the app requirements are in this case (“save everything when stuff changes”), we can just have the collection watch itself and persist when it changes. Then our models don’t even have to know or care how they get persisted: the collection will watch itself, and whenever we add, remove, or change something it’ll re-save itself. This keeps our logic nicely encapsulated in the collection, whose responsibility it is to deal with models. Makes sense, right?

Turns out, that’s quite easy too. Inside our collection’s initialize method, we’ll do as follows. See line comments below:

initialize: function () {
  // Attempt to read from localStorage right away
  // this also adds them to the collection
  this.readFromLocalStorage();

  // We put a slight debounce on this since it could possibly
  // be called in rapid succession. We're using a small npm package
  // called 'debounce' for this: 
  // https://www.npmjs.org/package/debounce
  this.writeToLocalStorage = debounce(this.writeToLocalStorage, 100);

  // We listen for changes to the collection
  // and persist on change
  this.on('all', this.writeToLocalStorage, this);
}

Syncing between multiple open tabs

Even though it’s not specified in the spec, we went ahead and handled the case where you’ve got the app open in multiple tabs in the same browser. In most of the other implementations, this case isn’t covered, but it feels like it should be. Turns out, this is quite simple as well.

We simply add the following line to our initialize method in our collection, which listens for storage events from the window:

window.addEventListener('storage', this.handleStorageEvent.bind(this));

The corresponding handler inside our collection looks like this:

handleStorageEvent: function (event) {
  if (event.key === STORAGE_KEY) {
    this.readFromLocalStorage();
  }
}

The event argument passed to our storage event handler includes a key property, which we can use to determine which localStorage value changed. These storage events don’t fire in the tab that caused them, and they only fire in other tabs if the data is actually different. This is perfect for our case: we simply check whether the change was to the key we’re storing to, run readFromLocalStorage, and we’re good.

That’s it! Here’s the final collection code.

note: The app spec for TodoMVC is a bit contrived (understandably). If you’re going to use localStorage in a real app, beware that it is shared by all open tabs of your app, and that your data schema may change in a future version. To address these issues, consider namespacing your localStorage keys with a version number to avoid conflicts. While all of these problems can be solved, in most production cases you probably shouldn’t treat localStorage as anything other than a somewhat untrustworthy cache. If you use it to store something important and the user clears their browser data, it’s all gone. Also, you can’t always trust that you’ll get valid JSON back, so a try/catch would probably be wise as well.
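
For example, a slightly more defensive version of the read method might look something like this (just a sketch; the versioned key name is hypothetical and not part of the TodoMVC app):

// assume STORAGE_KEY is namespaced and versioned, e.g. 'todos-ampersand-v2'
readFromLocalStorage: function () {
  var existingData = localStorage[STORAGE_KEY];
  if (!existingData) return;
  try {
    this.set(JSON.parse(existingData));
  } catch (e) {
    // invalid JSON: treat it as an empty cache rather than crashing
    delete localStorage[STORAGE_KEY];
  }
}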

Session properties for editing state

If you paid close attention, you noticed the TodoMVC application spec also says we shouldn’t persist the editing state of a todo. This refers to the fact that you can double-click a task to put it into edit mode.

One thing that’s a bit unique in Ampersand is its use of what we call “session properties” to store things like the editing state.

If you look at the other examples, both Ember and Backbone only reference the “editing” state in the view code or the view controller; there’s no reference to it in the models. Compare that to our todo model:

var State = require('ampersand-state');


module.exports = State.extend({
  // These properties get persisted to localStorage
  // because they'll be included when serializing
  props: {
    title: {
      type: 'string',
      default: ''
    },
    completed: {
      type: 'boolean',
      default: false
    }
  },

  // session properties are *identical* to `props`
  // *except* that they're not included when serializing.
  session: {
    // here we declare that editing state, just like
    // the properties above.
    editing: {
      type: 'boolean',
      default: false
    }
  },

  // This is just a convenience method that gives
  // our view something simple to call when
  // it wants to trash a todo.
  destroy: function () {
    if (this.collection) {
      this.collection.remove(this);
    }
  }
});

You might be thinking, WHAT!? You’re storing view state in the models?!

Yes. Well… sort of.

If you think about it, is it really view state? I’d argue it’s “application state,” or really “session state” that’s very clearly tied to that particular model instance.

Conceptually, at least to me, it’s clear that it’s actually a state of the model. The view is not in “editing” mode; the model is.

How the view or the rest of the app deals with that information is irrelevant. The fact is, when a user edits a todo, they have put that particular todo into an editing state. That has nothing to do with a particular view of that model.

This distinction becomes even more apparent if your app needs to do something else based on that state information, such as disabling application-wide keyboard shortcuts, or applying a class to the todo-list container element when it’s in edit mode.
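
For example, some other part of the app could react to that state directly (a hypothetical sketch, not something the TodoMVC app actually does):

// e.g. in the main view's initialize: the collection proxies its models' events
this.listenTo(app.me.todos, 'change:editing', function (todo) {
  // disable global shortcuts, toggle a class on the list container, etc.
  document.querySelector('#todo-list').classList.toggle('editing', todo.editing);
});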

Even if you disagree with that, what about readability? Let’s say you’re working with a team on this app: where can they go to see all the state we’re storing related to a single todo?

In the Backbone.js example the model code reads like this:

app.Todo = Backbone.Model.extend({
  // Default attributes for the todo
  // and ensure that each todo created has `title` and `completed` keys.
  defaults: {
    title: '',
    completed: false
  },

  // Toggle the `completed` state of this todo item.
  toggle: function () {
    this.save({
      completed: !this.get('completed')
    });
  }
});

and in Ember:

Todos.Todo = DS.Model.extend({
  title: DS.attr('string'),
  isCompleted: DS.attr('boolean')
});

Neither of these gives any indication that we also care about whether a model is in editing mode or not. We’d have to dig into the view to see that. In an app this simple, it’s not a big deal. In a big app this kind of thing gets problematic very quickly.

It feels so much clearer to see all the types of state related to that model in a single place.

Using a subcollection to get filtered views of the todos

The spec says we should have 3 different view modes for our todos:

  1. All todos
  2. Remaining todos
  3. Completed todos

There are a few different ways we could go about this. We’ve got our trusty ampersand-collection-view which will take a collection and render a view for each item in the collection. It also takes care of adding and removing items if the collection changes, as well as cleaning up event handlers if the parent view is destroyed.

That collection view is included in ampersand-view and is exposed as a simple method: renderCollection.

One way to accomplish what’s being asked in the spec would be to create three different collections and shuffle todos around between them based on their completed state. But that feels a bit weird, because we really only have one item type. We could also have a single base collection and request a new filtered list of todos from that collection each time any of them changes, which is how the Backbone.js implementation does it. But that would mean it’s no longer just a rendered collection; instead we’d have to re-render a view for each todo in the matching set, which doesn’t feel very clean or efficient.

It seems cleaner/easier to just have a single todos collection and then render a “filtered view,” if you will. Ideally, we’d just be able to set a mode of that filtered view and have it add/remove as necessary.

So we want something that behaves like a normal collection, but which is really just a subset of that collection.

Then we could still just call renderCollection once, using that subcollection.

Then if we change the filtering rules of the subcollection things would Just Work™. In ampersand we’ve got just such a thing in ampersand-subcollection.

If you give it a collection to use as a base and a set of rules like filters, a max length, or its own sorting order, it behaves like a “real” collection. It has a models array of its current models, a length property, and its own comparator, and it fires events like add/remove/change/sort as the underlying data in the base collection changes, but it fires those events based on its own filters and rules.

So, let’s use that. In this case we just need a single subcollection, so we’ll just create it and attach it to the collection as part of its initialize method:

var Collection = require('ampersand-collection');
var SubCollection = require('ampersand-subcollection');
var Todo = require('./todo');


module.exports = Collection.extend({
  model: Todo,
  initialize: function () {
    ...
    // This is what we'll actually render
    // it's a subcollection of the whole todo collection
    // that we'll add/remove filters to accordingly.
    this.subset = new SubCollection(this);
    ...
  },
  ...
});

Now, rather than just rendering our collection, in our main view we’ll render the subcollection instead:

this.renderCollection(app.me.todos.subset, TodoView, this.queryByHook('todo-container'));

We’ll talk about model structure in just a minute, but for now just know that app.me.todos is our todos collection and app.me.todos.subset is the subcollection we just created above.

The TodoView is the constructor (a.k.a. view class) for the view we want to use to render the items in the collection and this.queryByHook('todo-container') will return the DOM element we want to render these into. If you’re curious about queryByHook, see this explanation of why we use data-hook.

So, now we can just re-configure that subcollection and it will fire add/remove events for changes based on those filters and our collection renderer will update accordingly.

There are three valid states for the view mode we’re in. It can be "active", "completed", or "all". So now we create a simple helper method on the collection that configures it based on the mode:

setMode: function (mode) {
  if (mode === 'all') {
    this.subset.clearFilters();
  } else {
    this.subset.configure({
      where: {
        completed: mode === 'completed'
      }
    }, true);
  }
}

So where does that mode come from? Let’s look at our model structure.

Modeling state

In Ampersand a common pattern is to create a me model to represent state for the user of the app. If the user is logged in and has a username or other attributes, we’d store them as props on the me model. In this app there are no persisted me properties, but we still have a user of the app we want to model, and that user has a set of todos that belong to them. So we’ll create that as a collection property on the me object like so:

var State = require('ampersand-state');
var Todos = require('./todos');


module.exports = State.extend({
  ...
  collections: {
    todos: Todos
  },
  ...  
});

Things that otherwise represent “session state” or other cached data related to the user can be attached to the me model as session properties as we described above.

Something like the mode we described above fits into that category.

Ideally, we should be able to simply change the mode on the me model and everything else should just happen.

And, since we’re using ampersand-state, we can change the entire mode of the app with a simple assignment, as follows:

app.me.mode = 'all';

Go ahead and open a console on the app page and try setting it to various things. Note that it will only let you set it to a valid value. If you try doing: app.me.mode = 'garbage' you’ll get this error:

[screenshot of the resulting TypeError]

This type of defensive programming is hugely helpful for catching errors in other parts of your app.

This works because we’ve defined mode as a session property on our me model like this:

mode: {
  type: 'string',
  values: [
    'all',
    'completed',
    'active'
  ],
  default: 'all'
}

It’s readable and behaves as you’d expect.
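
The wiring between the mode and the subcollection might look roughly like this in the me model’s initialize (a sketch; the deployed app may organize it slightly differently):

// inside the me model's initialize
this.on('change:mode', function (me, mode) {
  // reconfigure the subcollection's filters whenever the mode changes
  this.todos.setMode(mode);
}, this);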

Calculating various lengths/totals

The app spec states we must show counts of “items left” and “items completed,” plus we have to be able to know if there aren’t any items at all in the collection so we can hide the header and footer.

This means we need to track 3 different calculated totals at all times.

Ultimately, if this is state we care about, we want it to be easily readable as part of a model definition. Since we have a me model that contains the mode and has a child collection of todos, it makes sense for it to care about and track those totals. So we’ll create session properties for all of those totals too.

In the me model’s initialize we can listen to events in our collection that we know will affect these totals, and then we have a single method handleTodosUpdate that calculates and sets those totals.

The totals are quite easy; we check todos.length for totalCount, loop through once to calculate how many items are completed for completedCount, then use simple arithmetic for activeCount.

Just for clarity, we also set a boolean value for whether all of them are completed or not. This is because the spec states that if you go through and check all the items in the list, the “check all” checkbox at the top should also check itself. Tracking that state as a separate boolean makes it nice and clear.

So, now our me model looks something like this:

...
initialize: function () {
  // Listen to changes to the todos collection that will
  // affect lengths we want to calculate.
  this.listenTo(this.todos, 'change:completed change:title add remove', this.handleTodosUpdate);

  // We also want to calculate these values once on init
  this.handleTodosUpdate();
  ...
},
// Define our session properties
session: {
  activeCount: {
    type: 'number',
    default: 0
  },
  completedCount: {
    type: 'number',
    default: 0
  },
  totalCount: {
    type: 'number',
    default: 0
  },
  allCompleted: {
    type: 'boolean',
    default: false
  },
  mode: {
    type: 'string',
    values: [
      'all',
      'completed',
      'active'
    ],
    default: 'all'
  }
},
// Calculate and set various lengths we're
// tracking. We set them as session properties
// so they're easy to listen to and bind to DOM
// where needed.
handleTodosUpdate: function () {
  var completed = 0;
  var todos = this.todos;
  todos.each(function (todo) {
    if (todo.completed) {
      completed++;
    }
  });
  // Here we set all our session properties
  this.set({
    completedCount: completed,
    activeCount: todos.length - completed,
    totalCount: todos.length,
    allCompleted: todos.length === completed
  });
},
...

At this point we have all the state we want to track for the entire app. None of it is mixed into any of the view logic. We’ve got a completely de-coupled data layer that tracks all state for the app.

You can see the me model in its entirety as currently deployed on github.

Routing

Once we’ve done all of this state management, the router becomes super simple.

We’ve already created a mode flag on the me that actually controls everything.

So all we have to do is set the proper mode based on the URL, which we can do like so:


var Router = require('ampersand-router');


module.exports = Router.extend({
  routes: {
    // this matches all urls
    '*filter': 'setFilter'
  },
  setFilter: function (arg) {
    // if we passed one, set it
    // if not set it to "all"
    app.me.mode = arg || 'all';
  }
});

Views

At this point it’s really all a matter of wiring things up to the views. The views contain very little actual logic. They simply declare how things should be rendered, what data should be bound where, and turn user actions into changes in our state layer.

For this app, the index.html file contains the layout HTML already. So the main view is just going to attach itself to the <body> tag as you can see in our app.js file, below. We simply hand it the existing document.body and never call render() because it’s already there.

var MainView = require('./views/main');
var Me = require('./models/me');
var Router = require('./router');


window.app = {
  init: function () {
    // Model representing state for
    // user using the app. Calling it
    // 'me' is a bit of convention but
    // it's basically 'app state'.
    this.me = new Me();

    // Our main view
    this.view = new MainView({
      el: document.body,
      model: this.me
    });

    // Create and fire up the router
    this.router = new Router();
    this.router.history.start();
  }

};

window.app.init();

The views in this particular app handle all bindings declaratively, as described by the bindings property of the views. It might feel a tad verbose, but it’s also very precise. This way you, as the developer, can decide whether you want to just render things into the template on first render, or whether you want to bind them. It’s also useful for publishing re-usable views, because you don’t have to include any templating library as part of them.

Templates and views are easily the most debate-inducing portion of modern JS apps, but the main point is that Ampersand.js gives you an agnostic way of doing data binding that’s there if you want it, but completely gets out of your way if you’d rather use something like Handlebars or React to handle your view layer.

That’s the whole point of the modular architecture of Ampersand.js: optimize for flexibility, install only what you want to use.

For a full reference of all the data binding types you can use, see the reference documentation.

Below are the declarative bindings from the main view with comments describing what each does.

Note that model in this case is the me model. So model.totalCount, for example, references the me.totalCount session property discussed above. If you really prefer tracking state in your view code, it’s easy to do so: simply add props or session properties just like you would in a model, and everything still works.
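
For instance, a view with its own session property might look roughly like this (a hypothetical sketch, not code from the TodoMVC app):

var View = require('ampersand-view');


module.exports = View.extend({
  // view-local state, declared just like on a model
  session: {
    showDetails: { type: 'boolean', default: false }
  },
  bindings: {
    // no 'model.' prefix: this binds to the view's own property
    showDetails: {
      type: 'toggle',
      hook: 'details'
    }
  }
});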

It’s worth noting that, with the way we’ve declared bindings in this app, they still work if you replace this.el, or if this.model changes or doesn’t exist at the time of first render; the bindings will still be set and updated accordingly.

In real apps these binding declarations are often simpler than this, but on the plus side this example serves as a good demo of the types of bindings that are available. Here’s the data binding section from our js/views/main.js view:


...

bindings: {
  // Toggles visibility of main and footer
  // based on truthiness of totalCount.
  // Since zero is falsy it won't show if
  // total is zero.
  'model.totalCount': {
    // this is the binding type
    type: 'toggle',
    // this is just a CSS selector
    selector: '#main, #footer'
  },
  // This is how you do multiple bindings
  // to a single property. Just pass an 
  // array of bindings.
  'model.completedCount': [
    // Hides the clear-completed span
    // when there are no completed items
    {
      type: 'toggle',
      // "hook" here is shortcut for 
      // selector: '[data-hook=clear-completed]'
      hook: 'clear-completed'
    },
    // Inserts completed count as text
    // into the span
    {
      type: 'text',
      hook: 'completed-count'
    }
  ],
  // This is an HTML string that we made
  // as a derived (a.k.a. computed) property
  // of the `me` model (see the sketch after
  // this bindings block). We did it this way
  // for simplicity because the target HTML
  // looks like this:
  // "<strong>5</strong> items left"
  // where "items" has to be correctly pluralized.
  // Since it's not just text, but not really
  // a bunch of nested HTML either, it was easiest
  // to bind it as `innerHTML`.
  'model.itemsLeftHtml': {
    type: 'innerHTML',
    hook: 'todo-count'
  },
  // This adds the 'selected' class to the right
  // element in the footer
  'model.mode': {
    type: 'switchClass',
    name: 'selected',
    cases: {
      'all': '[data-hook=all-mode]',
      'active': '[data-hook=active-mode]',
      'completed': '[data-hook=completed-mode]',
    }
  },
  // Bind 'checked' state of `mark-all`
  // checkbox at the top
  'model.allCompleted': {
    type: 'booleanAttribute',
    name: 'checked',
    hook: 'mark-all'
  }
},
...
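
As mentioned in the comments above, itemsLeftHtml is a derived (computed) property on the me model. Here’s a minimal sketch of how such a derived property can be declared with ampersand-state (the version actually deployed may differ in the details):

derived: {
  itemsLeftHtml: {
    // recalculated, and a change event fired, whenever activeCount changes
    deps: ['activeCount'],
    fn: function () {
      var word = this.activeCount === 1 ? 'item' : 'items';
      return '<strong>' + this.activeCount + '</strong> ' + word + ' left';
    }
  }
}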

A few closing thoughts

I’m excited that we were asked to contribute an example to TodoMVC. Big thanks to Luke Karrys, Philip Roberts and Gar for their help/feedback on building the app and to Sindre Sorhus, Addy Osmani, and Pascal Hartig for their hard work on the TodoMVC project, as it’s quite useful for comparing available tools.

If you have any feedback, ping me (@HenrikJoreteg on Twitter) or any of the other core contributors for that matter. You can also jump into the #&yet IRC channel on freenode and tell us what you think. We’re always working to improve.

We think we’ve created something that strikes a good balance between flexibility, expressiveness, readability, and power, and we’re thrilled about the fast adoption and massive community contribution we’ve seen in just a few short months since releasing Ampersand.js.

I’ll be speaking about frameworks in Brighton at FullFrontal in a few weeks, and then about Ampersand.js at BackboneConf in December. Hope to see you then.

If you like the philosophy and approaches described here, you might also enjoy my book, Human JavaScript. If you want Ampersand.js training for your team get in touch with our training coordinator.

See you on the Interwebz! <3

● posted by Peter Saint-Andre

Two of our core values on the &yet team are curiosity and generosity. That’s why you’ll so often find my yeti colleagues at the forefront of various open-source projects, and also sharing their knowledge at technology and design conferences around the world.

An outstanding example is the work that Philip Roberts has done to understand how JavaScript really works within the browser, and to explain what he has discovered in a talk entitled “What the Heck is the Event Loop, Anyway?” (delivered at both ScotlandJS and JSConf EU in recent months).

If you’d like to know more about the inner workings of JavaScript, I highly recommend that you spend 30 minutes watching this video - it is fascinating and educational and entertaining all at the same time. (Because another yeti value is humility, you won’t find Philip boasting about this talk, but I have no such reservations because it is seriously great stuff.)

What the Heck is the Event Loop, Anyway?

● posted by Adam Brault

&yet has long been a wandering band of souls—like a mix of the A-Team and the Island of Misfit Toys from Rudolph the Red-Nosed Reindeer.

Over five years, one thing that’s been a constant for &yet is realtime. We’ve worked our way through many technologies—some ancient, some nascent, and many of our own. We’ve stayed focused on the users of those technologies—end users and developers.

Our path forward has become clearer and more focused than it’s ever been. Some of the terrific people we’ve added to our team this last year have had tremendous influence on honing our focus.

We know the type of company we aspire to be: people first, and always determined to make things better for humans on all sides of our work.

We know the type of experiences we aim to create: friendly, user-focused, and simple.

We know the types of problems we want to solve: we want to empower realtime software and the teams who build it, and we want to create a huge wake of open source that helps make communication technologies as open and accessible as the web itself.

We have a clearer sense of focus than we’ve ever had. But for us, there’s one missing component.

You.

In order for us to get where we want to go, we need you. The communities we are part of have shaped &yet in instrumental ways.

Blog posts are great, but they’re not enough. We don’t believe in blog post comments—we believe in conversations.

We want to invite our community to be a more intimate part of what we’re doing here, and we want to use email as a way to share the things we’re working on and learning and wondering about.

To that end, we are launching &you, a mailing list with bi-weekly dispatches covering all the things &yet does and has learned, and we want you to be part of it. Your aspirations, your questions, your problems, your frustrations, your wishes, your hopes. What do you need? How can we help?

We’ll get into more of that as we go, but first, we need your name and email address to get the ball rolling.

Join us. We need you!

● posted by Nathan Fritz

A web application is not the same as the service it uses, even if you wrote them both. If your service has an API, you should not make assumptions about how the client application is going to use the API or the data.

A successful API will likely have more than one client written for it, and the authors of those clients will have very different ideas about how to use the data. Maybe you’ll consume the API internally for other uses as well. You can’t predict the future, so part of separating your API concerns from your clients should be feature detection.

For a real time application, feature detection is a great way to manage client subscriptions to data.

As I discussed a few weeks ago, for realtime apps it’s better to send hints, not data. When a client deliberately subscribes to a data channel for updates on changes, that is an explicit subscription. By contrast, an implicit subscription occurs when a client advertises the features and data types it is capable of dealing with, and the server automatically subscribes it to the relevant data channels.

Implicit subscriptions give the client more flexibility in getting data and give the service more flexibility in managing the channels over which it pushes data. Implicit subscriptions can also be affected by relationships with and the availability of other users.

For example, in a collaboration application, I may be in a group. My client might support displaying the geographical location of users, so my client publishes a feature URI (say, http://jabber.org/protocol/geoloc) to the service.
Since my client is a member of a group, the service might subscribe me to the geolocation of other users in the group, such as /adam/geolocation and /bill/geolocation.

Using implicit subscriptions here saves the client from the hassle of managing all of these subscriptions when various events occur (when I first log in, when other users log in and out, when users are added and removed from the group, etc.).

This keeps the concerns of the service separate from the concerns of the client. The service knows and cares which data feeds it is sending to the client, but the client doesn’t care. The client only cares that the service sends the data types it supports and wants. The client doesn’t need to know how these data feeds are organized at all, and so it shouldn’t be up to the client to manage its subscriptions to them.

In addition to subscriptions, feature publishing can affect the results I get back when I make an API call. When a client explicitly retrieves data about /users/bill, the service could send along related items that my client has declared support for or interest in. In the spirit of hypermedia APIs, the service could also suggest URIs of related information with the results of a query.
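
Purely as an illustration (the URIs and payload shape here are hypothetical, not a defined protocol), a response to a query for /users/bill might look something like this:

{
  "id": "/users/bill",
  "name": "Bill",
  "geolocation": { "lat": 46.2, "lon": -119.1 },
  "related": [
    "/users/bill/avatar",
    "/groups/engineering/members"
  ]
}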

Managing subscriptions for realtime web applications can be tedious. Using implicit subscriptions - by creatively combining feature detection, API calls, user presence and relationships, and recent queries - can give web apps and their authors a lot of flexibility without the tedium.


Would you like our help with your architecture? Please contact us, we’d be glad to help!

● posted by Bear

Every Operations Team needs to maintain the system packages installed on their servers. There are various paths toward that goal, with one extreme being to track the packages manually - a tedious, soul-crushing endeavor even if you automate it using Puppet, Fabric, Chef, or (our favorite at &yet) Ansible.

Why? Because even when you automate, you have to be aware of what packages need to be updated. Automating “apt-get upgrade” will work, yes - but you won’t discover any regression issues (and related surprises) until the next time you cycle an app or service.

A more balanced approach is to automate the tedious aspects and let the Operations Team handle the parts that require a purposeful decision. How the upgrade step is performed, via automation or manually, is beyond the scope of this brief post. Instead, I’ll focus on the first step: how to gather data that can be used to make the required decisions.

Gathering Data

The first step is to find out what packages need to be updated. To do that we will use the operating system’s package manager. For the purposes of this post I’ll use the apt utility for Debian/Ubuntu and yum for RedHat/Centos.

apt-get -s dist-upgrade
yum list updates

Apt will return output that looks like this:

Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
  libxfixes-dev
The following packages will be upgraded:
  base-files openssl tzdata
3 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Inst base-files [6.5ubuntu6.7] (6.5ubuntu6.8 Ubuntu:12.04/precise-updates [amd64])
Conf base-files (6.5ubuntu6.8 Ubuntu:12.04/precise-updates [amd64])
Inst tzdata [2014c-0ubuntu0.12.04] (2014e-0ubuntu0.12.04 Ubunt
Inst openssl [1.0.1-4ubuntu5.14] (1.0.1-4ubuntu5.17 Ubuntu:12.04/precise-security [amd64])

Yum will return output that contains:

Updated Packages
audit.x86_64       2.2-4.el6_5       rhel-x86_64-server-6
audit-libs.x86_64  2.2-4.el6_5       rhel-x86_64-server-6
avahi-libs.x86_64  0.6.25-12.el6_5.1 rhel-x86_64-server-6

Both of these tools provide the core data we need: package name and version. Apt even gives us a clue that it’s a security update - the presence of “-security” in the repo name. I imagine that yum can also provide that; I just haven’t found the proper command line argument to use.

The Next Step

Having this data is still not enough: we need to gather, store, and then process it. To that end I’ll share a small Python program that parses the output from apt so the data can be stored. At &yet we use etcd for storage, but any backend data store will suffice. Processing the data for each server reflects the second step of our path - reducing the firehose of data into actionable parts that can then be carried along the path to the next step.

#!/usr/bin/env python
import json
import datetime
import subprocess
import etcd

hostname = subprocess.check_output(['uname', '-n']).strip()  # node name, trailing newline stripped
ec       = etcd.Client(host='127.0.0.1', port=4001)
normal   = {}
security = {}
output   = subprocess.check_output(['apt-get', '-s', 'dist-upgrade'])
for line in output.split('\n'):
    if line.startswith('Inst'):
        items      = line.split()
        pkgName    = items[1]
        oldVersion = items[2][1:-1]  # strip the surrounding [brackets]
        newVersion = items[3][1:]    # strip the leading parenthesis
        if '-security' in line:
            security[pkgName] = { 'old': oldVersion, 'new': newVersion }
        else:
            normal[pkgName] = { 'old': oldVersion, 'new': newVersion }
data = { 'timestamp': datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
         'normal': normal,
         'security': security,
       }
key = '/packages/%s' % hostname
ec.write(key, json.dumps(data))

When you run this, you will get an entry in etcd for each server, with a list of packages that need updating.

The remaining steps along the path are now attainable because the groundwork is done - for example, you can write other cron jobs to scan that list, check the timestamp, and produce a report for all servers that need updates. Heck, you can even use your trusty Ops Bot to generate an alert in your team chat channel if a server has gone more than a day without being checked or having a security update applied.

The point is this - if you’re not monitoring, you are guessing. The tool above enables you to monitor your installed package environment and that’s the first step along the many varied paths toward mastering your server environments.

● posted by Nathan Fritz

The NoSQL “movement” in database design has been motivated by many factors (such as simplicity and scalability) and has resulted in many more choices among storage and retrieval solutions. Instead of a one-size-fits-all approach, you can choose a database that is optimized for your specific needs.

So what are your needs, and how do you choose the right database for the job?

If you don’t need a database cluster, and can live with a single node and snapshot backups, then you can pretty much do whatever you want.

The CAP Theorem

If we assume you need a cluster for high availability, let’s talk about the CAP Theorem. Wait, I just lost some of you to eye rolls and “this doesn’t apply to me,” but please, hear me out.

The CAP Theorem states that a distributed database can only have two of the following qualities:

  • consistency (writes are in order, and you’ll get the same answer from multiple servers at any point in that order)
  • availability (if a database node is running, you can always write to it)
  • partition-tolerance (even if database nodes can’t communicate with each other, you can still read the data)

You’ve heard “fast, good, cheap: pick two” — right? Essentially, for any given data type, you get to pick two.

In practice, there are only two realistic choices for any given database: consistency and partition-tolerance (“CP”) or availability and partition-tolerance (“AP”). “CA” doesn’t really exist, because without partition tolerance, you don’t have availability. At least Kelly Sommers and Aphyr agree.

Some databases assert that the CAP theorem doesn’t apply to them. To see what happens when databases make claims that violate the CAP theorem, have a look at Aphyr‘s Call Me Maybe series. Not all of these databases are bad, though; sometimes they’re simply claiming a tuned CP (like FoundationDB) or a way of resolving consistency for you (which you still need to be cautious and aware of).

What You Get With CP Databases

Most SQL databases claim CP. This means that they guarantee consistency. You can still read during bad partitions, and you’ll probably write to a single place for any given table. But they’re not available: if the master is down, and another hasn’t been elected, you can’t make writes.

Most SQL databases scale very high for reads, because that load can easily be distributed, but they don’t scale horizontally for writes while keeping consistency; a master-master SQL cluster must resolve write conflicts with timing or app intervention.
SQL servers can become unavailable for writes during high load on the master servers and during outages that affect only master nodes. If your data is primarily about reading (a product database, a user database, etc.), then SQL is an excellent choice, and will scale very well for you.

But let’s say your use case is collaboration. With an SQL database, you’ll soon have to start sharding your users and interactions to separate clusters to keep up with writes.
For collaboration use cases, your users are writing just as much as reading.

Considering AP Databases

It may be time to look at an AP database.

Some of the best AP database clusters are based on the Dynamo whitepaper by Amazon.

A database cluster that claims availability means that a large selection of nodes will need to be offline before you lose the ability to write and read a given piece of data. Generally databases with the availability claim work best when any given piece of data exists in multiple data-centers. An AP database also has partition-tolerance: the nodes can stop talking to each other, and your data is still there to read and write.

As the CAP theorem implies, the big downside to AP database clusters is the lack of consistency. This means that for any given piece of data, you might get different results depending on which node you ask. Generally these databases gather results so that the nodes eventually agree, even if what they agree on is multiple versions of the truth.

This isn’t as bad as it sounds.

Lack of consistency happens in an AP cluster when a single piece of data has been written to in two different places, either close together in time or during a net-split (when the two nodes were unable to communicate due to network failure). Some databases have settings, and even defaults, for resolving these cases with the Last-Write-Wins solution. Keep in mind, however, that you’re then destroying data from a confirmed write.

The best solution for lack of consistency is in your application itself, perhaps in the API service, for example. When you read a piece of data and get multiple results back, you should have a function for resolving consistency problems with that data-type. Merging the data, picking one over another, or creating a new data object are all possible solutions, and should be hand selected for each data type.
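
For example, a resolver for a simple to-do data type might merge siblings like this (a JavaScript sketch that assumes each version carries an updatedAt timestamp; real resolution logic is always type-specific):

// given the multiple versions ("siblings") returned for one key,
// keep the most recent title but treat "completed" as sticky
function resolveTodoSiblings(siblings) {
  return siblings.reduce(function (merged, sibling) {
    if (!merged) return sibling;
    return {
      title: sibling.updatedAt > merged.updatedAt ? sibling.title : merged.title,
      completed: merged.completed || sibling.completed,
      updatedAt: Math.max(merged.updatedAt, sibling.updatedAt)
    };
  }, null);
}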

Hybrid Solutions

In the end, you may want to mix a single state server, a CP cluster, and an AP cluster together for the different data types that you use. This hybrid approach can help you strike the right balance when your requirements are complicated by the limits of the CAP Theorem.

Since this is my blog post, I’ll tell you about my personal favorites:

  • Standalone, non-clustered data, snapshot backups, slaving: Redis
  • Consistent, partition-tolerant, high reads, low writes: Postgres
  • Available, partition-tolerant, horizontal scaling: Riak

I’m sure I’ll have different favorites tomorrow. :-)

I’ve also been working on Dulcimer, which &yet uses to develop locally against an embedded leveldb, and deploy and scale against Riak (Postgres support is in progress). Dulcimer aims to be a single way of dealing with a wide variety of key-store technologies.

In the end, choose the best tool for the job, and don’t get wedded to any specific piece of technology.


If you have in-depth questions about choosing the right database for your needs, consider our architecture consulting services.

Feel free to comment directly to Nathan Fritz @fritzy.

● posted by Philipp Hancke

As you probably know, we run Talky, a free videochat service powered by WebRTC. Since WebRTC is still evolving quickly, we add new features to Talky roughly every two weeks. So far, this has required manual testing in Chrome, Opera, and Firefox each time to verify that the deployed changes are working. Since the goal of any deploy is to avoid breaking the system, each time we make a change we run it through a post-commit set of unit tests, as well as an integration test using a browser test-runner script as outlined in this post.

All that manual testing is pretty old-fashioned, though. Since WebRTC is supposed to be for the web, we decided it was time to apply modern web testing methods to the problem.

The trigger was reading two blog posts published recently by Patrik Höglund of the Google WebRTC team, describing how they do automated interop testing between Chrome and Firefox. This motivated me to spend some time on the post-deploy process of testing we do for Talky. The result is now available on github.

Let’s review how Talky works and what we need to test. Basically we need to verify that two browsers can connect to our signaling service and establish a direct connection. The test consists of three simple steps:

  • determine a room name to test against by generating a random number to use for the room URL
  • start two browsers
  • determine that the peer-to-peer connection is up and that video is running.

If the process fails in the staging area, our ops team will not deploy the new version to the main Talky site.

Although step one is easy, starting the two browsers is more complicated. When a user goes directly to a videochat room we show a “check-your-hair screen” which requires a user action to join. It’s already possible to skip this by using a localStorage javascript setting. This means we need to start both browsers with a clean profile and pre-seed the localStorage database with some of those settings.

To get around all that manual testing, we want to run these tests on servers and machines that don’t have any webcams or microphones attached. Fortunately, this is pretty easy to achieve because the browser manufacturers provide special ways to simulate webcams and microphones for testing purposes. In Chrome, this is done by adding --use-fake-device-for-media-stream as a command line argument when starting the browser. In Firefox, a special fake: true flag needs to be set in the getUserMedia constraints (as explained here).

Since we don’t want user interaction, we also need to do something similar to skip the security prompt. In Chrome, that is achieved with the --use-fake-ui-for-media-stream flag; in Firefox, this is done by setting the preference media.navigator.permission.disabled:true.

Next, we need to actually start the browsers in a way which doesn’t require any visible windows and works on headless servers as well. Fortunately, there is a Linux tool for this called xvfb. It is even used by a sample script that is part of Google’s WebRTC code available on github. After starting two browsers, we need to wait for them to become connected. This is relatively easy to determine by listening for the iceconnectionstatechange events of the WebRTC peerconnection API. Check the simplewebrtc demo page for a basic example.

We wait for this event to happen and then write something to the logs. In Chrome this is relatively easy, since normal console.log calls are written to the log file on disk. In Firefox this turned out to be slightly more complicated: we need to set the preference browser.dom.window.dump.enabled:true and then use a window.dump call to write something to standard output. For Talky, we log the string P2P connected. Other applications, such as Jitsi Meet, can be tested this way as well. Our shell script then searches the logs for that string and, if found, waits for another five seconds before declaring the test a success and exiting.
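
In client code, that looks roughly like this (a simplified sketch of the standard peerconnection API, not the exact Talky source):

peerConnection.addEventListener('iceconnectionstatechange', function () {
  var state = peerConnection.iceConnectionState;
  if (state === 'connected' || state === 'completed') {
    // Chrome writes console.log to its log file; in Firefox we'd use
    // window.dump() with browser.dom.window.dump.enabled set to true
    console.log('P2P connected');
  }
});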

Sounds pretty simple, eh? It’s just putting a bunch of pieces together, building on the work Patrik Höglund has done and pushing it slightly further. The test saves us lots of time on each deploy and allows us to deploy changes to our new talky.pro service without headache. We can even run it continuously to check whether our service is up. We’re also integrating this technique into our software development process for all the Otalk WebRTC modules.


Want to get started with WebRTC? Check out our WebRTC consulting services.

Comment directly to Philipp Hancke @HCornflower.

● posted by Jenn Turner

Next Tuesday, &yet is hosting a blood drive put on by the American Red Cross.

For some, that’s maybe the most boring sentence they’ve ever read.

For others, who have been on a different side of that need, it’s another chance.

Chance? Need? Blood?

For some just reading the word “blood,” or thinking about the fluid is enough to trigger some sort of uneasy “I don’t fill-in-blank well with blank” reaction.

Coming to terms with the reality of components within us that keep us alive and breathing can be a jarring experience for some.

I’m not sure why, but when confronted with images of blood or needles, I get a slightly warm, queasy feeling that I want to just push away from myself and ignore until homeostasis returns. If I had to guess, I would blame part of it on social conditioning to be made uncomfortable by things that in my experience have only held a very important and sterile place.

But I think another part of it is being confronted with the reality that my life is finite. There are limited resources within my conscious being, keeping me alive. I could lose it all in an event completely out of my control, at any time.

So yeah, thinking about blood makes me uncomfortable. I don’t want to do it.

However, I would argue that coming to terms with something more devastating, like the death of a loved one from loss or lack of those life sustaining components–like blood–is a far more traumatic experience.

One I have had, repeatedly, from a disease called Leukemia.

Leukemia started stealing family members from me in high school, starting with my grandfather. The pattern continued in college with the death of a beloved second cousin, “Uncle” Jim. He was the one who told me “If you drink and drive, don’t smoke — you gotta have one hand on the wheel.”

If you are unfamiliar with Leukemia, let me introduce you to it by saying it sucks.

Leukemia is a cancer of the blood and blood cells, that comes in a variety of diseases. For more information on the illness, I’ll let the good folks at the Leukemia and Lymphoma Society educate you in far more articulate and less emotionally charged terms on their site.

(Interesting aside: only two percent of the American population donates blood regularly. If one more percent began donating on a regular basis, it would eradicate the supply shortage. Just one percent. And 38 percent of the population is eligible to donate, meaning they meet the donor requirements, but only 10 percent ever do.)

Treatments for Leukemia include radiation, chemotherapy, antibiotics, and blood transfusions. Transfusions are often necessary because of the depleting effects the first two treatments can have on a patient’s blood supply.

Transfusions give people time. Transfusions save lives.

I know I’m not saying anything you haven’t heard before, but if you’re like me, you’re probably still somewhat hand wavy about feeling bad for that bad stuff that happens to people sometimes who are not you.

True, Leukemia hasn’t happened to me, personally. It did remove individuals from my life who I used to be able to see and talk with and hear their voices and feel their hugs and have their presence in my life.

But, since these people are my family, I very well could have that sneaky genome for Leukemia lurking in my system, to someday steal me from my daughter, my family. And if that happens, I want to fight. I want more time.

I like being alive. As I’m guessing, most people do.

I hardly ever think about how at least three pints are needed for a transfusion, and that in the event of an accident, if I’m taken to the hospital, that those pints are already there on the shelf.

Because someone made the conscious contribution to save my life.

I’m asking for your help in this. One pint of your blood has several different components and has the potential to save three lives. Three human beings.

Sure it’s uncomfortable, but it’s an easy way to make the world a better place by giving the least of yourself so that someone else can live.

Be a donor.

If you’re in the local Tri-Cities area, come to the &yet office at 110 Gage Boulevard, Suite 100 in Richland, Washington this Tuesday, September 30 from 10 am to 3 pm. I will greet you at the door.

If you’re inspired to donate in your area, visit the American Red Cross at www.redcrossblood.org to find a drive or center in your area. (It’s easy, I promise, there’s a blue box at the top right hand corner that says “Give Blood. Find a Blood Drive.”)

Every little bit counts.

Thank you.

● posted by Peter Saint-Andre

More and more application developers have come to rely on platform-as-a-service providers for building and scaling software.

WebRTC’s complexity makes it ripe for this kind of approach, so it’s no surprise that so many early WebRTC companies have been platform service providers. Unfortunately for customers, the nascent Rent-Your-WebRTC-Solution market has proven pretty unstable.

News came yesterday that yet another provider of WebRTC hosted services—in this case, Requestec—has been acquired. We’ve seen this movie before with Snapchat’s acquisition of AddLive and we’ll probably see it again, maybe multiple times.

At &yet, we’ve been working steadily at creating open source software and approaches to infrastructure to help our clients avoid the volatile WebRTC rental market.

We believe we can help use and create common standards while working collaboratively with a coalition of developers who want to push WebRTC forward in an open way.

We’re pretty old school. We believe that’s the way the best parts of the web have always been, and if it’s up to us, that’s the way the web—including WebRTC—will always be.

As with any software, yes, you could build your entire platform yourself—from frontend to backend, from clean-coded clients to sustainable ops, including scalable and secure signaling, NAT traversal, and media servers. But it’s awfully expensive and time-consuming to do that, even if you can find folks who have strong competence in this field. (And we can vouch that it’s not a trivial undertaking, because we’ve spent a year and a half doing it!)

So you could rent, you could build your own, or you could work with a technology team that has the following qualities:

  • They’re experts on all aspects of WebRTC
  • They’re veterans with many other realtime technologies
  • They produce code libraries that are extremely easy to build upon
  • They’re dev-friendly, ops-knowledgeable, and totally approachable
  • They’re dedicated to teaching and open-sourcing everything they know
  • They’re even willing and able to help you run the platform yourself
  • They’re completely independent and bootstrapped so they’ll never sell the company—period

Sounds like one of those gauzy dream sequences in a cheesy movie–the kind that’s shown right before someone gets hit with a large dose of cold, hard reality.

But this movie has a happy ending, because at &yet we’re just that kind of team.

Want to write your own story when it comes to WebRTC?

We encourage you to embrace the open platform that is Otalk, check out our WebRTC consulting services, and drop us a line to start a conversation.

● posted by Nathan Fritz

At &yet, we’ve always specialized in realtime web apps. We’ve implemented them in a wide variety of ways and we’ve consulted with numerous customers to help them understand the challenges of building such apps. A key difference is that realtime apps need a way of updating the application without direct intervention from the user.

Growing Pains

What data you send, and how much you send, is completely contextual to the application itself. Your choice of transport (polling, long-polling, WebSockets, server-sent events, etc.) is inconsequential as far as updating the page is concerned. App experience and performance are all about the data.

In our earliest experiments, we tightly coupled client logic with the updates, allowing the server side to orchestrate the application entirely. This seems rather “cool,” but it ends up being a pain due to lack of separation of concerns. Having a tightly-coupled relationship between client and server means a lot of back and forth, nearly infinite amounts of pain (especially with flaky connections), and too much application orchestration logic.

We moved on to simply giving the new data to the client when things changed, which removed all of the orchestration pain and gave more control to the application. Even this had some interesting pain points when dealing with shifts from offline to online, cache control, and lack of control over memory. Here are some of the problems we ran into:

  • When replaying events after a reconnect, the service had to remember what should have been sent, in what order, for that user specifically.
  • Similar issues arose when switching subscription states on data.
  • Applications that simply couldn’t handle all of the data being loaded into the app at once had a lot of timing issues with updates and API calls.
  • If your permissions were tricky, pushes became an extra place where they had to be checked.

Getting the Hint

Eventually we realized all of these problems could be solved with hinting. Rather than sending the data, we send the data type, the ID, and whether it was an update, a delete, or new data. The application then requests the data it cares about - and only the data it cares about - from the HTTP API.
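
In practice a hint is a tiny message, and the client-side handler stays equally small. An illustrative sketch (the socket, cache, and api objects here are hypothetical, and the message shape will vary by app):

// example hint pushed from the server: { type: 'todo', id: '42', action: 'update' }
socket.on('hint', function (hint) {
  if (hint.action === 'delete') {
    cache.remove(hint.type, hint.id);
    return;
  }
  // for 'new' or 'update', fetch the authoritative copy from the HTTP API
  api.get('/' + hint.type + 's/' + hint.id, function (err, data) {
    if (!err) cache.set(hint.type, hint.id, data);
  });
});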

Now after a reconnect, we can query for a set of IDs that have changed since that time plus some extra for good measure, without caring about order at all. Since all of the data comes from the API, we don’t have timing issues with API data versus update data, and we have a single source of truth and a true separation of concerns. The application can now keep caches, and mark the cached data as dirty or delete it accurately over time, whether the data is being displayed or just kept handy.

All-in-all, simply hinting data changes removes a lot of tricky edge cases, and empowers the application developer to control their data effectively. Hinting FTW!


We can help you leverage our expertise on architecture.
Feel free to comment directly to Nathan Fritz @fritzy.