Changing at Euston and Sweating The Details

I was heading home from dinner[1] with M., and Apple Maps told me that the Overground to Stratford was cancelled. That was very useful, and meant that I headed towards Queen's Park rather than Brondesbury. At Queen's Park, Citymapper helpfully pointed out that the Overground to Euston and then the Victoria line was about five minutes faster than the Bakerloo and Victoria lines via Oxford Circus. That was also very useful.

But where Citymapper really stood out was that it showed me that it took five minutes to get through Euston. That they showed it was a curious bit of product transparency, and it raised a number of questions about how they arrived at the design (and how they populated it in their map database).

It also made me time the transfer. It took a little under four minutes, but I was walking quickly. So it was well within the expectation that Citymapper gave me, and increased my general trust in the app. I also somewhat believe that Citymapper sent people out to walk the various transfers.[2]

I hear a lot of exhortations that good design is about sweating the details. This seems like a good one: worrying about just how navigable a train station is. (Let’s take TfL’s own Monument/Bank as the object lesson in just how far a transfer walk can go.) As I am refining my user research practice, I am becoming very aware of how much the details can impact research. Here’s one great example of how details matter, and how getting into the weeds can make a difference.

Over the past three or four months, I have been realizing that a deep and comprehensive understanding of any classifier or detector is an absolutely crucial detail for an embedded product researcher. Even with the sophistication of the classifiers at Meta, every time I have launched a product that depends on the output of a classifier, I have had at least one surprising or unexpected result. These results have ranged from an interesting insight on user behavior to a the-entire-experiment-was-pointless finding. (Unfortunately, recently, it’s been more towards the latter end of the range, which is probably why this is so close to my heart now.)

Tomorrow, I’m going to start taking a hard look at a detector for an upcoming product experiment. It’s about an important product area, and I’m really worried that our false positive rate is going to be very high. There is also a good chance that a big part of the false positives are being driven by young adult creators[3], and there may be an iron trade-off between “voice” and safety. In other words, it may be impossible to improve an (under-18) creator’s ability to attract new followers without hurting some aspect of their safety.
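
As a rough illustration of the kind of check involved (not the actual analysis), here is a minimal sketch of a per-segment false positive rate, assuming a small hand-labeled sample is available. The function name, the segment labels, and the rows are all made up for the example.

```python
from collections import defaultdict


def false_positive_rate_by_segment(rows):
    """Compute the false positive rate per segment.

    rows: iterable of (segment, flagged, truly_violating) tuples, where
    `flagged` is the detector's output and `truly_violating` is a human label.
    FPR = flagged-but-fine items / all fine items, within each segment.
    Segments with no fine (negative) items map to None.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    segments = set()
    for segment, flagged, truly_violating in rows:
        segments.add(segment)
        if not truly_violating:
            negatives[segment] += 1
            if flagged:
                false_positives[segment] += 1
    return {
        s: (false_positives[s] / negatives[s] if negatives[s] else None)
        for s in segments
    }


# Toy, invented rows: (segment, detector flagged?, human says violating?)
sample = [
    ("young_adult", True, False),
    ("young_adult", True, False),
    ("young_adult", False, False),
    ("young_adult", True, True),
    ("other", True, False),
    ("other", False, False),
    ("other", False, False),
    ("other", False, False),
]

rates = false_positive_rate_by_segment(sample)
for segment, rate in sorted(rates.items()):
    print(segment, rate)
```

If one segment's rate sits well above the rest, that is a hint that the overall false positive rate is being driven by that group rather than by the detector as a whole.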

Sometime soon, I’ll spend a bit more time on the approach I’ve used to understand my classifiers (or, how I find the big issues.)

  1. It was hard to tell this was “Australian” in any fundamental sense. The menu was “bistro generic,” and at least the (cheap) part of the wine list that I looked at was Italian and Spanish wines. Both of us had the schnitzel, which was excellent.

  2. For a UK-focused company, this isn’t really a huge expense. Even for a city like London, it would take two people about a week to case out all the transfers. Back of the envelope, it’s about $6K, and that’s assuming that labor is $60/hour. You could do the rest of the country for another $12K, probably very well. 

  3. Yes. I said “creators” and “engagement” somewhat non-ironically. I am developing an appreciation for what I call the “passionate amateur” niche, and “creator” seems like a good enough name for it. And “engagement” covers a lot of territory (although, I think sometimes we make it sound much more, um, engaging, than our data supports.)