D3.js is Not a Graphing Library, Let's Design a Line Graph

—Friday, June 24 2011

Working with graphing libraries can be tedious. Designing them can be downright frustrating. Each one is slightly different, but most of them share two common flaws: design by configuration and a template-based design approach. A bar graph can be just a few bars with labels and tick marks…until it isn’t. Want to change the background color? New option. Want to change a bar color? New option. Want to hide the x-axis labels? New option. Want to highlight a specific bar with a different color? Good luck doing that.

As long as you stay within the confines of the template, it’s simple. But anytime you want to customize a specific aspect of the original template, more configuration options have to be added to the library. You should avoid “design by configuration.”

I’m going to use jqplot, a pretty popular graphing library, in the following examples. This isn’t an attempt to single out jqplot. Many graphing libraries use this template approach. I’ve gone down the rabbit hole myself. With that out of the way, let’s take a look at a simple line graph in jqplot.

Before we begin: This is not a LOC comparison. This is about writing code that makes sense. I’m using CoffeeScript in most places and JavaScript where appropriate. If you’re not familiar with CoffeeScript, you can follow along with the JavaScript code. The complete CoffeeScript file can be found here.

var plot1 = $.jqplot ('chart1', [[3,7,9,1,4,6,8,2,5]]);

This looks simple enough. First we include the proper renderers and then we create the default graph by specifying an element and passing in a nested array of data points. Once this renders, we end up with labels, grids, colors and shadows which were never specified. We’ll need to configure the graph to adjust or remove these items.

[Image: jqplot line graph]

What happens when we need to tweak a few things? We end up with a crazy mix of nested hashes that become really hard to parse. And if you need more than one axis, things can really get crazy. In this jqplot example, “up to 9 y axes are supported”. I’ve never seen a graph with 9 y axes! The larger question is, why would you stop at 9?

var plot = $.jqplot ('chart2', [[3,7,9,1,4,6,8,2,5]], {
    // Give the plot a title.
    title: 'Plot With Options',
    // You can specify options for all axes on the plot at once with
    // the axesDefaults object.  Here, we're using a canvas renderer
    // to draw the axis label which allows rotated text.
    axesDefaults: {
      labelRenderer: $.jqplot.CanvasAxisLabelRenderer
    },
    // An axes object holds options for all axes.
    // Allowable axes are xaxis, x2axis, yaxis, y2axis, y3axis, ...
    // Up to 9 y axes are supported.
    axes: {
      // options for each axis are specified in separate option objects.
      xaxis: {
        label: "X Axis",
        // Turn off "padding".  This will allow data point to lie on the
        // edges of the grid.  Default padding is 1.2 and will keep all
        // points inside the bounds of the grid.
        pad: 0
      },
      yaxis: {
        label: "Y Axis"
      }
    }
  });

D3: Layers, Shapes, Text and Scales


What about D3? D3 is a relatively new visualization library created by the extremely talented and proactive mbostock. D3 can do many things (that we’ll get into later), but for now, let’s define what a “graph” is.

At its core, a graph is just layers of paths, primitives, color and text, something SVG is perfectly suited for. Let’s take a look at a simple SVG line graph in d3.

  data  = [3,7,9,1,4,6,8,2,5]
  w     = 700
  h     = 300
  max   = d3.max(data)

  # Scales
  x  = d3.scale.linear().domain([0, data.length - 1]).range [0, w]
  y  = d3.scale.linear().domain([0, max]).range [h, 0]

  # Base vis layer
  vis = d3.select('#chart')
    .append('svg:svg')
      .attr('width', w)
      .attr('height', h)

  # Add path layer
  vis.selectAll('path.line')
    .data([data])
  .enter().append("svg:path")
    .attr("d", d3.svg.line()
      .x((d,i) -> x(i))
      .y(y))

See example in action →

Note: Many methods in d3 (data, attr, x, y, etc.) accept a function and evaluate it with the current data point. For example, given yfunc = function(d) { return d; }, writing .y(function(d) { return d; }) is the same thing as writing .y(yfunc). Also, every attribute set through attr is part of the SVG spec. There are no “magic” attr values.

Lines 7 and 8 are where the magic happens. These two lines return functions that accept a single argument from the input domain. The x scale above is saying: “I want a linear scale that represents data between 0 and 8 and I want the values returned to fit in the pixel range of 0 and the width (w).” Let’s see that expanded:

  x = d3.scale.linear().domain([0, 8]).range [0, 700]
  # If given a value in the domain, should always return a number between 0 and 700
  console.log x(8), x(4)  # logs 700 350
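
To see why these calls return functions, here is a minimal plain-JavaScript sketch of a linear scale. The linearScale helper is a hypothetical stand-in written for illustration, not D3's implementation, and it assumes a simple two-point domain and range.

```javascript
// Hypothetical stand-in for d3.scale.linear().domain(...).range(...).
// It returns a function that maps a domain value onto the range.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return function (value) {
    // Fraction of the way through the domain (0 at d0, 1 at d1).
    const t = (value - d0) / (d1 - d0);
    // The same fraction of the way through the range.
    return r0 + t * (r1 - r0);
  };
}

const xScale = linearScale([0, 8], [0, 700]); // index -> horizontal pixel
const yScale = linearScale([0, 9], [300, 0]); // value -> vertical pixel (flipped)

console.log(xScale(0), xScale(4), xScale(8)); // 0 350 700
console.log(yScale(0), yScale(9));            // 300 0
```

The y range is written as [h, 0] because SVG's origin is the top-left corner, so larger data values need smaller y coordinates.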

Lines 11-14 are pretty straightforward. They are responsible for appending the svg element to the DOM and setting up some basic dimensions for our document.

Let’s look at lines 17-22, which can be thought of as the first layer in our graph. We supply the data here and generate the resulting path data. If we were drawing more than one line, our data might look like .data([data1, data2]). In this example, we’re only plotting one dataset so we have an array with a single array entry. Moving along, we then get to the oddly named enter method. I like to think of enter as find_or_create, but be sure to check out the docs for an expanded explanation. Finally, we append the svg path element and use a path data generator to generate the path data. Let’s take a look at the path data generator:

d3.svg.line().x((d,i) -> x(i)).y(y) 

We’re using the scale functions we defined earlier. The i argument passed through to svg.line’s x method is the current index. So x and y will be called for each piece of data in our dataset, with the current piece of data as the first argument and the current index as the second argument. After the code above is executed, we end up with a pretty simple SVG document.

<svg width="700" height="300"><path d="M0,200L87.5,66.66666666666669L175,0L262.5,266.6666666666667L350,166.66666666666669L437.5,100L525,33.33333333333337L612.5,233.33333333333334L700,133.33333333333331"></path></svg>
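
The d attribute above is just a string of move (M) and line-to (L) commands. As a rough illustration of what the path data generator does with the scaled points, here is a hand-rolled sketch in plain JavaScript. The linePath, px and py names are hypothetical, and the digits in the last decimal places may differ slightly from the output above because of floating-point rounding.

```javascript
// Hand-rolled illustration of what a line generator produces: an SVG path
// string that moves to the first point (M) and draws a line to each later one (L).
const data = [3, 7, 9, 1, 4, 6, 8, 2, 5];
const w = 700;
const h = 300;
const max = Math.max(...data);

// Same mappings as the x and y scales in the article, written inline.
const px = (i) => (i / (data.length - 1)) * w;
const py = (d) => h - (d / max) * h;

function linePath(values) {
  return values
    .map((d, i) => (i === 0 ? "M" : "L") + px(i) + "," + py(d))
    .join("");
}

console.log(linePath(data));
```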

Note: If you find yourself writing for loops with d3, you could be doing it wrong. Very rarely will you need to do this.
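
Loops are rarely needed because the data join does the pairing for you: data matches each datum with an element, and enter yields the data that have no element yet. The sketch below shows that idea in plain JavaScript. The join helper is hypothetical and assumes matching by index, which is what D3 does when no key function is supplied.

```javascript
// Hypothetical helper illustrating a data join by index.
// `existing` stands in for the elements a selectAll() found.
function join(existing, data) {
  const update = []; // data that already has an element
  const enter = [];  // data that still needs an element
  data.forEach((d, i) => {
    if (i < existing.length) update.push({ node: existing[i], datum: d });
    else enter.push({ datum: d, index: i });
  });
  // Elements left over once the data runs out.
  const exit = existing.slice(data.length);
  return { update, enter, exit };
}

// First render: no <path> exists yet, so the single dataset "enters".
const first = join([], [[3, 7, 9]]);
console.log(first.enter.length, first.update.length);   // 1 0

// Second render: the path is already there, so nothing enters.
const second = join(["<path>"], [[3, 7, 9]]);
console.log(second.enter.length, second.update.length); // 0 1
```

This is why enter feels like find_or_create: anything chained after it only runs for the entries that were missing.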

CSS: It Just Works

Many existing libraries are implemented using Canvas. It’s not possible to style objects drawn to a canvas element using CSS, but that hasn’t stopped people from trying. SVG on the other hand? SVG has excellent CSS support. Let’s style the path we’ve drawn above:

path {
  /* SVG paths are filled black by default; turn that off so only the line shows. */
  fill: none;
  stroke: #c00;
  stroke-width: 3px;
}

That’s it! We can put this in the same stylesheets we’re already using to style DOM elements. If a designer comes along and wants to tweak a few things, there is no need to go digging around in JavaScript.

Caveat: If you want to open SVG renders in a program such as Adobe Illustrator and retain styling, you’ll need to use attributes instead of externally defined CSS.

Grouping: Labels and Grids

When you have elements that share a similar position, it’s really handy to use the <g> element. All children of a parent <g> element will share the same relative position. For instance, a tick mark and its accompanying label should be positioned pretty close to each other. We don’t want to run this computation more than we have to, so we use a group.

# Add tick groups
ticks = vis.selectAll('.tick')
  .data(y.ticks(7))
.enter().append('svg:g')
  .attr('transform', (d) -> "translate(0, #{y(d)})")
  .attr('class', 'tick')

# Add y axis tick marks
ticks.append('svg:line')
  .attr('y1', 0)
  .attr('y2', 0)
  .attr('x1', 0)
  .attr('x2', w)

# Add y axis tick labels
ticks.append('svg:text')
  .text((d) -> d)
  .attr('text-anchor', 'end')
  .attr('dy', 2)
  .attr('dx', -4)

See step 2 in action →

We’ve already covered most of what is going on in the code above. The interesting part can be found on line 3: .data(y.ticks(7)). Remember those linear scales we defined earlier? ticks is a method which will return “uniformly spaced, human readable values guaranteed to be within the extent of the input domain.” In layman’s terms, if the scale’s domain is 0-10, y.ticks(5) will attempt to return about 5 evenly spaced numbers between 0 and 10. The count is a hint rather than a guarantee, so you may get a few more or fewer. Try it yourself in the console:

y = d3.scale.linear().domain([0, 10])
y.ticks(10) // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y.ticks(5) // [0, 2, 4, 6, 8, 10]

y = d3.scale.linear().domain([0, 500])
y.ticks(10) // [0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500]
y.ticks(5) // [0, 100, 200, 300, 400, 500]
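
To give a feel for how “human readable” values like these can be chosen, here is a simplified plain-JavaScript sketch of one common heuristic: pick a step of 1, 2 or 5 times a power of ten that comes closest to the requested count. The niceTicks function is hypothetical and is not D3's source; it assumes start < stop and a positive count.

```javascript
// A simplified sketch of choosing "nice" tick values (not D3's source).
// Assumes start < stop and count > 0.
function niceTicks(start, stop, count) {
  const span = stop - start;
  // Largest power of ten that does not exceed the ideal spacing.
  let step = Math.pow(10, Math.floor(Math.log10(span / count)));
  // Ratio of that power of ten to the ideal spacing;
  // a small ratio means the step is too fine and should be widened.
  const err = (count / span) * step;
  if (err <= 0.15) step *= 10;
  else if (err <= 0.35) step *= 5;
  else if (err <= 0.75) step *= 2;

  // Collect the whole multiples of step that fall inside [start, stop].
  const ticks = [];
  const first = Math.ceil(start / step);
  const last = Math.floor(stop / step);
  for (let k = first; k <= last; k++) ticks.push(k * step);
  return ticks;
}

console.log(niceTicks(0, 10, 5));   // [0, 2, 4, 6, 8, 10]
console.log(niceTicks(0, 500, 10)); // [0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500]
```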

Moving on, you’ll notice the transform attribute. This is the svg transform attribute. In this particular case, we’re using our y scale to position the group. Can you guess what d is here? It’s each entry in the array of tick marks we talked about above.

Further down, you’ll notice we don’t have to position the svg:line elements because they are being positioned relative to their parent group. Finally, we add the actual svg:text label, and use dx and dy to make small tweaks to the text relative to the current x and y locations.
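
The relative positioning can be spelled out with a little arithmetic. The following plain-JavaScript sketch computes where a tick's line and label end up on the page; tickLayout is a hypothetical helper, and the numbers assume the same h = 300 and max = 9 used throughout.

```javascript
// Sketch: children of a translated <g> are placed relative to the group,
// so their absolute position is the group's offset plus their own.
const h = 300;
const max = 9;
const y = (d) => h - (d / max) * h; // same mapping as the y scale

function tickLayout(value) {
  const group = { x: 0, y: y(value) }; // translate(0, y(d))
  const label = { dx: -4, dy: 2 };     // small nudges on the text
  return {
    transform: "translate(" + group.x + ", " + group.y + ")",
    // The line is drawn at y1 = y2 = 0 inside the group.
    lineY: group.y + 0,
    // The label sits slightly left of and below the group origin.
    labelX: group.x + label.dx,
    labelY: group.y + label.dy,
  };
}

console.log(tickLayout(0)); // transform "translate(0, 300)", lineY 300, labelX -4, labelY 302
console.log(tickLayout(9)); // transform "translate(0, 0)", lineY 0, labelX -4, labelY 2
```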

Selectively Highlighting Data

Sometimes we want to make a particular datapoint stand out. Whether we use color, visibility, or another method, it can be very difficult or impossible with template-based graphing libraries. D3 makes this very easy given most methods allow you to pass along a function to evaluate with the current piece of data. For example, I chose to highlight only the max datapoint when plotting a 30 day trend over at Portland Crime.

Let’s add circles to each data point along our line and highlight the largest data value using color and size.

# Add point circles
vis.selectAll('.point')
  .data(data)
.enter().append("svg:circle")
  .attr("class", (d, i) -> if d == max then 'point max' else 'point')
  .attr("cx", (d, i) -> x(i))
  .attr("cy", (d) -> y(d))
  .attr("r", (d, i) -> if d == max then 6 else 4)

See step 3 in action →

Lines 5 and 8 are what we’re looking at. For every data point passed in, we check whether it is equal to the max. If it is, we add a new class and increase its radius (r) value.
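
Stripped of the SVG, the highlighting logic is just two small functions evaluated once per data point. The sketch below runs them in plain JavaScript; pointClass and pointRadius are hypothetical names for the two callbacks.

```javascript
// Sketch of how per-datum functions drive the highlight.
const data = [3, 7, 9, 1, 4, 6, 8, 2, 5];
const max = Math.max(...data);

// The same decisions the attr callbacks make.
const pointClass = (d) => (d === max ? "point max" : "point");
const pointRadius = (d) => (d === max ? 6 : 4);

// Evaluate both functions once per datum, as attr would.
const points = data.map((d, i) => ({
  index: i,
  value: d,
  className: pointClass(d),
  r: pointRadius(d),
}));

// Only the entry for the value 9 (index 2) is highlighted.
console.log(points.filter((p) => p.className === "point max"));
```

Because the decision lives in a function rather than a configuration option, any rule you can express in code (a threshold, a date, every fifth point) works the same way.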

Events: Making it Interactive

Interactivity is a great way to liven up a graph. In our case, we’re going to tweak the radius of the points when someone hovers or leaves one. You could also use the same technique for adding tooltips or updating another DOM element. It doesn’t have to be inside the SVG document. Let’s modify the code above.

# Add point circles
vis.selectAll('.point')
  .data(data)
.enter().append("svg:circle")
  .attr("class", (d, i) -> if d == max then 'point max' else 'point')
  .attr("r", (d, i) -> if d == max then 6 else 4)
  .attr("cx", (d, i) -> x(i))
  .attr("cy", (d) -> y(d))
  .on('mouseover', -> d3.select(this).attr('r', 8))
  .on('mouseout',  (d) -> d3.select(this).attr('r', if d == max then 6 else 4))  # restore the original radius
  .on('click', (d, i) -> console.log d, i)

See final graph →

CSS & Attribute Precedence: Be careful! If you define a style such as stroke-width in a stylesheet, it will take precedence over the attribute stroke-width. If you’re attempting to override a style of an element, use the .style method.

Lines 9-11 are what we’re looking at. The on method allows us to attach an event listener to a node. When the event is triggered, the callback is fired, passing in the current piece of data and an index. All we’re doing is increasing and decreasing the radius of the current point.

Inside most methods, this will refer to the current element in the SVG document. It isn’t “prewrapped” with d3 selection methods, so you’ll need to run d3.select(this) if you intend to use them.
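
How this ends up pointing at the element is ordinary JavaScript: the callback is invoked with the node as its receiver. The sketch below shows the mechanism with a plain object standing in for a circle. The fire dispatcher is hypothetical and not part of D3.

```javascript
// Sketch of how a listener can receive the element as `this`.
// `fire` is a hypothetical dispatcher, not part of D3.
function fire(listener, node, datum, index) {
  // Function.prototype.call sets `this` to the node for this invocation.
  return listener.call(node, datum, index);
}

// A plain object standing in for an SVG <circle>.
const circle = { tagName: "circle", r: 4 };

// Must be a regular function: an arrow function ignores the `this` passed by call().
function grow(d, i) {
  this.r = 8;
  return this.tagName + " #" + i + " holds " + d;
}

console.log(fire(grow, circle, 9, 2)); // "circle #2 holds 9"
console.log(circle.r);                 // 8
```

In CoffeeScript this means writing these callbacks with -> rather than =>, since the fat arrow binds this to the surrounding scope.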

Lesson for the Reader

  1. Those two 0s at the bottom left look pretty cramped. I think both of them can be implied. Can you get rid of them while keeping the other numbers intact?
  2. The line looks a little rigid. Try experimenting with the interpolation and tension.

Final Thoughts

D3 isn’t a graphing library. It’s so much more. It’s Cartograms, Choropleth Maps, Chord Diagrams and a whole bunch of other stuff. On a camping trip, it’s the tool you never leave behind. If you’re fishing, it’s your tackle box. If you’re designing a line graph for the web? It’s your graphing library.