Small things done well¹

Bad design is bad

In 2015 Moxie Marlinspike pointed out that the manual page for GPG is (now) 50% of the length of the novel Fahrenheit 451. Any software whose man page approaches 20 thousand words had better have a good excuse, and GPG can only gesture vaguely at decades of questionable design.

GPG gets a bad rap but it isn’t really much of an outlier. Security software has a long history of crummy, unintuitive interfaces and terrible design choices. A deep dive into the factors behind awfully designed security software isn’t the purpose of today’s blogpost, but suffice it to say there is seldom pressure from the end users. Security software mandated by a security team is often rammed down users’ throats, so it doesn’t bother being pleasant. It’ll sell anyway.

We’ve worked hard to buck this trend from our first version. It’s one reason why we are one of the few pieces of security software that customers actually talk about in terms of love:

https://canary.tools/love

Recently, we released a major update to our Console. It’s been simmering for ages now, and has some really subtle flavours in the details. We figured we’d highlight a few of our favorite bits (and the thinking around them).

At the outset it’s worth knowing that we have always designed our Console to not keep our users hemmed in. While many security vendors try hard to be the “single pane of glass” commanding your daily attention, our goal is that you almost never need to log in. Ideally, you'd set up your birds on day-1, and never go back until it mattered.

Learning through experience

We rely on a third party accounting tool to help manage quotes, invoices, receipts, and the rest of the financial admin that a good-looking company such as ourselves might require. A while back the accounting software vendor sent a mail announcing a major update that included significant interface changes. They announced this as “good news”, but we treated it like a bad smell:

We were happy with our accounting package. We didn’t want “big changes” in it. We wanted to build awesome software and not think about our accounting package for 10 seconds more than we had to. A bit of reflection on this experience caused us to reevaluate our new Console design. Was this how our customers would feel too? We were about to change their interactions with us. Did they need this change?

Everything is Different (not!)

The new update was built around a handful of key features. Huge amongst them was that we moved to a modern front-end framework and introduced the concept of grouping birds. The path to the final design was long and twisting. We experimented with a number of ideas to group birds (flocks!) and tried prototypes of many of them. We just couldn’t get comfortable with most of them.

In light of the accounting software experience, we went back to the drawing-board (several times) till we could make sure that a user who wasn’t signing up for any of the new functionality would effectively get a Console with almost no changes.

Instead, they’d be greeted with a familiar layout and just a touch of added colour. This also helped shape our transition plan (for moving customers onto the new interface) and set the boundaries for the rest of the design work.

Let’s talk specifics

With the major design considerations out of the way, we want to introduce a bunch of the small touches that are impactful for users.

Cut and paste

We know that if someone is using the Console, they are probably going to be digging into incidents and passing around information. So we’ve made sure that all the fields you could need are easily copyable (and copy to a sane format). No need to highlight fiddly text and copy it just right. We take that problem away with handy Copy-actions next to each field.

The “search anything” Search

When we first built Canary, most of our users had 2-5 birds (and no Canarytokens). Now we have customers with over 1,000 Canaries and hundreds of thousands of Canarytokens. This means that our previous concept of listing Canaries (or tokens) falls away, which is why the new Console features a handy new global search box. Just start typing, and we will home in on the bird, flock, token, alert or artefact that you want to look at.

It’s tucked away silently at the top of the screen, but once you get used to it, you will never look back.

Defenders (should) think in Graphs

Canary alerts are simple, and one alert is generally enough to let you know you’ve got to cancel your plans for the weekend, but what happens if you walk into 10 of them? Viewing the alerts as a graph allows you to spot at a glance if it’s 10 attackers attacking 1 server, or 1 attacker targeting 2 machines.

We previously created “Graph View” to address this, but it was a few clicks away and was less polished. That’s now changed with our new and improved graph view.

With cleaner elements and connections (and some simple animations) the graph view is now easier to use and understand than ever before. We think it’ll be the default view for many customers, giving them a quick glance into exactly what’s happening with their birds. 
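
As a rough illustration of why the graph framing helps (a minimal sketch, not the Console's actual code), a pile of alerts can be reduced to a mapping of sources to birds. The record fields `src` and `bird` and the function names here are assumptions made for the example.

```python
from collections import defaultdict


def summarise_alerts(alerts):
    """Build both directions of the source <-> bird relationship."""
    targets_by_source = defaultdict(set)
    sources_by_target = defaultdict(set)
    for alert in alerts:
        targets_by_source[alert["src"]].add(alert["bird"])
        sources_by_target[alert["bird"]].add(alert["src"])
    return dict(targets_by_source), dict(sources_by_target)


def describe(alerts):
    """One-line summary: how many distinct sources hit how many distinct birds."""
    targets_by_source, sources_by_target = summarise_alerts(alerts)
    return f"{len(targets_by_source)} source(s) hitting {len(sources_by_target)} bird(s)"


# Hypothetical alert records: ten alerts, all from one source, spread over two birds.
alerts = [{"src": "10.0.0.5", "bird": f"bird-{i % 2}"} for i in range(10)]
print(describe(alerts))
```

Ten rows in a table look like ten problems; the same ten alerts grouped this way read as one attacker touching two machines.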

Context and Cards

Although Flocks give our users finer control over their Console, they potentially add new levels of complexity to the UI. Previously, things were simple. You would click on a bird to see its settings, or click on an alert to view its details. There wasn’t any additional context to hold.

This changes when you group birds into Flocks. From a UX perspective, the additional context raises thorny issues. For example, if the New York flock is currently active on screen and an alert arrives from a bird in the Mumbai flock, should it be displayed and, if so, where? How do we view settings that apply only to the Cape Town Flock? Keeping that context clear has the potential to add an unpleasant cognitive load on the user.

If we simply used modals at each step, a user would very quickly get lost (and likely frustrated) in a deck of multiple open modals. We want to give our users these new features, but we don’t want to complicate the UX.

In order to handle these context switches (from global views to drilling down into a specific Flock and beyond) we use a combination of cards and transitions. Cards here are, essentially, panels which take over the current visual context without anything behind them (à la modals). The cards allow us to encapsulate the current context (where the user is). The transitions allow us to flow into this new context logically, and allow the user to intuitively understand the change.

Let’s look at clicking on a Flock:

The clicked flock card slowly transitions from being in a list to becoming the focal point. The user’s view narrows from the global context to the Johannesburg Office flock. The card encapsulates the new context and the transition allows the user to mentally follow the switch. No other content is visible behind the card, so the user knows they are in only one place. Multiple cards cannot be open at the same time.

A user clicking on the “Johannesburg Office” flock sees a transition. The flock moves from one of the listed flocks to grabbing focus and then expands to capture all of the user's attention. It then dominates their view and becomes their new context.

It was important for us to get this right. We added new features and complexity, but we didn’t want the user to have to pay for it. We needed to put in the effort and take the complexity away from the user.

Animations aren’t for everyone

Natural animations play a big part in guiding the users’ understanding of their context in the Console. Clicking on a Flock expands its card to fill the screen, since it’s now the focus of your attention. Want to edit the Flock’s settings? Flip the card around. But it’s possible that once you’ve gotten used to them, you no longer need the animations to give you context. So we’ve taken the unusual step of letting you turn off animations with a toggle. (Yeah, we like them and think they make the app better, but we aren’t going to force them onto a user that sees no use for them.) 

Inyoni

Along with the new look, we’ve introduced Inyoni. Inyoni, which means bird in Zulu (one of South Africa’s 11 official languages), may be seen hopping around the login form or periodically popping in.

Inyoni isn’t completely gratuitous and isn’t just about making a cute mascot. He also fills a real need. There are times when you could be on a section of the site where you won’t notice an alert come in. Canaries throw so few alerts, it would be a shame to miss one! So if you are scrolled away from your alert list at the top of the page (and only if you are scrolled away from your alert list), then you will hear a pleasing pop and Inyoni will politely nudge in to let you know that there’s something for you to check out.

Animated favicon

Similar to the Inyoni pop, if a user is busy on another tab and a new alert comes in, we add a small animation to the favicon. It’s relatively inconspicuous, but adds a heads-up that might be useful. (This is a little trickier than it looks: not all browsers support animated GIFs in the favicon.)

Artwork

Not being satisfied with only creating Inyoni and the favicon animation, Max helped us spruce up our artwork for this release as well. Instead of stock icons, we now use custom artwork he’s painstakingly created. You’ll find these amazing sketches in the Console:

Domain name slide

With all of our practical features, we have to confess that this one was added purely because we love how it looks! When a customer logs in, they see their domain name in the top left corner. Instead of it scrolling off the page as you scroll down, it now gradually decreases in size until it tucks up neatly under our logo (which also gently gets nudged up to make space for the text). It’s a little thing, and will likely be missed by most users, but it makes a bunch of us happy every time we see it.

Email Interactions

We are strong believers that design isn’t just about how things look, but about how they actually work. We make sure that even our emails to customers try to make things easier. When we send an alert email, the email includes actions that the user can take immediately at the bottom of the mail. (The user can choose to acknowledge the incident or add the source IP to the ignore list, for example.) Now most users might not even notice, but we do a bit more work on the back-end, so a user clicking one of these buttons doesn’t actually need to log in.

This only removes a few clicks for users, but it makes us incredibly happy knowing we’ve saved them the effort.

We do the same thing with our weekly Console round-up. Every mail includes an unsubscribe link that won’t try to guilt you into staying, and won’t ask you to explain why you no longer want to receive emails. Click the link, goodbye!

Integrations

Our view on integrations is that security vendors over-index on 3rd party integrations. “What products do you integrate with?” is a pretty standard sales question so it’s become the norm to tick as many of these boxes as possible.

This often adds unneeded complexity to the product and just as often, does so without adding extra value. We pride ourselves on a simple product with high fidelity alerts, so we’ve largely avoided this.

With the new Console, we’ve also included our first 3rd-party integration: Rumble, an ultra-light and quick network discovery tool. We’ve made sure that the integration is light-weight too, and customers who don’t use Rumble (or specifically turn the integration off) won’t ever notice it. Those who do will be able to quickly query Rumble for more information on IPs seen in Canary.

Again, this integration “just works”. If you are logged into Rumble, we’ll automatically detect it, and will present you with a lookup link. If you aren’t, you’ll never be bothered by it at all.

Conclusion

We work hard to delight our customers, whether it’s through support interactions, our birds, or, in this case, small UI changes that may well go unnoticed. We love them!

Security tools don’t need to suck. Sucking less actually means that users are more likely to actually use them to their full potential, and in the case of security tools, this is better for us all.


______

¹ With apologies to Rands

Something fresh

This month we’re ready to release our first major Canary Console overhaul. We’ve obviously pushed updates to Canary and the Console weekly for almost 5 years but this is the first time we’ve dramatically reworked the Console.


Contrary to a bunch of other products, we don’t want to be your single pane of glass, and work really hard to make sure that most customers never have to spend time in their Console at all. But our beefed up Console offers you a bunch of fresh possibilities, and we figured we’d introduce some of them here.


What’s different?

The first thing that a new user should notice is that it doesn’t feel that different to the old Console. It has a new coat of paint, and some things look slicker, but it feels like just a slight visual upgrade on the original Console.


This is completely by design, and belies a bunch of changes beneath the surface. It’s practically a trope that just as users become familiar with a product, the vendor drastically alters the user interface forcing users to re-learn flows which were previously easy. We hate this. Tools are supposed to make our lives easier, not periodically give us pop-quizzes.
 
We know that there’s a fine line between keeping the product familiar, and introducing new features (or refreshing the look). Throughout this process we’ve tried to keep a clear view of this line. We’re really excited to show a few of the enhancements that are deployed to customers as of today.

From the screenshot above, you can see a few of these right off the bat. 

Search

The new search-box at the top means that you never have to hunt for things again. When we first built Canary, customer Consoles had 5–10 birds in them (and no Canarytokens). Today, we have Consoles with hundreds of birds and tens of thousands of active Canarytokens. The search-box allows us to find things, even if we aren’t really sure what we are looking for. The search feature will let you find Birds, Incidents, Canarytokens and more without having to hunt, and you can search on pretty much any data tied to them.


A better graph-view

We’ve made a heap of improvements to our graph-view, which displays your alerts graphically rather than in table form. This is especially useful if you get a bunch of alerts; a quick click on the graph-view button will immediately clarify if it’s an attack sourced from a single or multiple locations in your network, and show you the birds involved. 


Birds of a Feather

The biggest improvement with the new Console is the ability to group birds into flocks. We spent heaps of time making this simple and intuitive so using it should feel pretty natural.


Once you’ve created a flock, any birds or tokens you previously had enrolled will be sitting in your “Default Flock”. These can be moved over to new flocks if you choose.

Of course the point (and joy) of having different flocks, is that you can treat them differently, so all of your flocks can have different settings, different users, and even different alerting rules.


User Management

Although we’ve supported the ability to add and remove users from your Console for a while, you now have much finer grained control of your users and their permissions. You can add users, restrict them to just a single flock, allow them to only deal with alerts, and delegate managing the actual birds to further users. The permissions model is simple: you can watch flocks, or manage them.
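
To make the watch/manage split concrete, here is a minimal sketch of how such a per-flock permission check could look. It is not the Console's implementation: the action names and the role each action requires are assumptions chosen for illustration.

```python
from enum import IntEnum


class Role(IntEnum):
    WATCH = 1    # may view and handle alerts for a flock
    MANAGE = 2   # may also change the flock's birds and settings


# Hypothetical action names and the minimum role each one needs.
REQUIRED_ROLE = {
    "view_alerts": Role.WATCH,
    "acknowledge_alert": Role.WATCH,
    "edit_bird_settings": Role.MANAGE,
}


def can(user_roles, flock, action):
    """Return True if the user's role on this flock is sufficient for the action."""
    role = user_roles.get(flock)
    return role is not None and role >= REQUIRED_ROLE[action]


# A user restricted to a single flock, and only to dealing with its alerts.
analyst = {"Cape Town": Role.WATCH}
print(can(analyst, "Cape Town", "acknowledge_alert"))
print(can(analyst, "Cape Town", "edit_bird_settings"))
print(can(analyst, "New York", "view_alerts"))
```

With only two ordered roles, every check is a single comparison, which is part of what keeps a model like this easy to reason about.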


Canarytokens

Canarytokens also gets a refresh in the new Console but it brings a bunch of utility that is deserving of its own post (next week). It’s especially useful to be able to place them in different groups. Shortly we’ll be releasing updates to the Console API alongside helper utilities to make it easier to deploy them by the dozen (or dozen dozen) inside your networks.


Audit Trail

If you’re an admin user, you now have access to the Audit Trail, which gives you detailed information on all activities performed on the Console. (You can also download a JSON dump of all activity if needed). The audit trail backend code has been in place for a while, so your audit trail is already populated with a bunch of your activity.


Support and our Knowledge Base

We try hard to make sure that Canary is easy to use, and where options need explanations, we usually build this into the app. However it’s still possible for users to lose their way. The Console now includes a constantly visible link to “help” that’s backed by a heavily populated Knowledge Base and a pretty decent search. You can still use the interface to send us a support request, and our helper elves will be super-quick to respond, but the KB should make things easier.


Email Changes

The new look also means that things like your weekly newsletter get a much-awaited visual upgrade but some changes are also functional.

Emails now include single-use buttons, to Acknowledge or Delete alerts (or to add them to your ignore list) which don’t require you to log in. (This allows you to react quickly from your phone/mailbox. We really mean it when we say we don’t want to be your single pane of glass.)
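
One common way to build single-use, no-login links like these is a signed token that is marked as spent on first use. The sketch below shows that general technique only; it is an assumption about how such buttons could work, not a description of the Console's back-end. The secret, the in-memory store and the function names are all hypothetical, and identifiers are assumed not to contain a colon.

```python
import hashlib
import hmac
import secrets

SECRET = secrets.token_bytes(32)   # hypothetical server-side signing secret
_used_nonces = set()               # in-memory stand-in for a store of spent tokens


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def make_action_token(incident_id: str, action: str) -> str:
    """Create a token to embed in an email button for one incident and one action."""
    nonce = secrets.token_hex(16)
    payload = f"{incident_id}:{action}:{nonce}"
    return f"{payload}:{_sign(payload)}"


def redeem_action_token(token: str):
    """Return (incident_id, action) the first time a valid token is seen, else None."""
    parts = token.split(":")
    if len(parts) != 4:
        return None
    incident_id, action, nonce, sig = parts
    expected = _sign(f"{incident_id}:{action}:{nonce}")
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None          # tampered or forged
    if nonce in _used_nonces:
        return None          # already used once
    _used_nonces.add(nonce)
    return incident_id, action


token = make_action_token("incident-42", "acknowledge")
print(redeem_action_token(token))   # first click succeeds
print(redeem_action_token(token))   # second click is rejected
```

The token names one specific incident and one specific action, so a leaked email link can do very little, and only once. A real deployment would also want an expiry time and a persistent store for spent tokens.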

Easy copy & paste

We’ve added a bunch of convenience functions to make sure that getting data out of the Console is quick and simple. Most data fields have a neatly hidden Copy button you can hit to grab data into your clipboard.


Authentication

We already support SSO and you’ve always been able to make use of Duo / Google Authenticator / Authy for MFA. The new Console adds the awesomeness that is WebAuthn to our authentication defense lineup.


More to come

There’s a bunch of other features that we can’t wait to share with you, and in the coming days we will release more blogposts. For now, take it for a spin. It should be all Canary: Simple, and easy to use. Drop us a note with your thoughts!


3rd-party API-Key Leaks (and the Broker)

INTRODUCTION

Continually refining our security operations is part and parcel of what we do at Thinkst Canary to stay current with attacker behaviours. We’ve previously written about how we think about product security (where we referenced earlier pieces on custom nginx allow-listing, sandboxing, or our fleet-wide auditd monitoring).

Recently we examined our exposure to API key leakage, and the results were unexpected.

THIRD PARTY API KEYs

Like most companies, we use a handful of third-party providers for ancillary services. And, like most providers, they expose an API and give us an API key. A short time back as part of an exercise in examining our internal controls relating to third-party API keys we asked:
  • has an attacker grabbed this key?
  • has she actually used this key?
  • what did she do with this key?
It turns out that even really popular service providers, by default, provide very crummy answers to these questions.

That’s quite a conclusion to reach.

To be clear, most providers expose their service logs (i.e. what did they do for you), but few expose API logs (what Thinkst’s API key did or attempted on the service). Consider a third-party transactional mailer service which sends emails on your behalf. An API key lets you send emails, but it also lets you query the API to recover previously sent emails (within a time window). With one provider there’s simply no way for us to determine whether our API key has ever been used to retrieve old emails; API logs aren’t available.

The tl;dr is that if we expect to have answers to these questions, we have to take care of creating those logs ourselves. But how?

OUR USE OF API KEYs

In our current model, we have hundreds of customer Consoles that hold an API key for mail and SMS providers. Both our mail and SMS providers restrict us to a single key, so we end up with all those consoles using a shared key. 



This isn’t a train smash, but what if an attacker compromised a single customer console?

With enough privileges, they’d be able to grab those keys and start to query the provider APIs. (It would be great if we could have separate keys, or if we could ask useful questions of the providers). We can’t, so we built the “broker”.

THE BROKER

Conceptually it’s pretty straightforward. The Broker is a proxy which sits between our Consoles and third-party providers. The actual API keys are stored on the Broker, and not the Consoles. So a breached Console cannot reveal valid API keys (because they’re absent). Instead, we generate replacement keys, unique to each Console, as part of our configuration management process.



As a single go-binary, the broker has a pretty small attack-surface, and logs all of its actions to our ELK stack.

This position on the hot path between API consumer and provider lets us add a bunch of coolness:
  • We create individual keys for each of our Consoles, even if the provider only gives us one. That means a breach on any Console doesn’t yield access to the third-party API key, and activity can be traced back to the individual Console;
  • We log all interactions. This gives us our audit trail which the API providers are unable or unwilling to expose;
  • We can cycle keys on a penny (and have added this into our configuration process);
  • (Down the line we can even do things like block certain API calls to certain keys).
This is conceptually similar to Diogo Mónica and Nate McCauley’s crypto-anchors (which is a 2014 talk worth watching).
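
The properties listed above can be sketched in a few lines. This is an illustrative toy, not the Broker itself (which is a single Go binary logging to ELK): the class, method names and the injected `send_upstream` callable are assumptions, and no real provider API is called.

```python
import json
import logging
import secrets

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("broker")


class Broker:
    def __init__(self, provider_key, send_upstream):
        self._provider_key = provider_key     # real key: held only by the Broker
        self._send_upstream = send_upstream   # callable(provider_key, request) -> response
        self._console_by_key = {}             # replacement key -> Console name

    def issue_key(self, console):
        """Create (or rotate) the replacement key for one Console."""
        # Drop any earlier key for this Console so rotation invalidates it.
        self._console_by_key = {
            k: c for k, c in self._console_by_key.items() if c != console
        }
        key = secrets.token_urlsafe(32)
        self._console_by_key[key] = console
        return key

    def handle(self, replacement_key, request):
        """Log the call, then forward it upstream using the real provider key."""
        console = self._console_by_key.get(replacement_key)
        allowed = console is not None
        log.info(json.dumps({"console": console, "request": request, "allowed": allowed}))
        if not allowed:
            raise PermissionError("unknown replacement key")
        return self._send_upstream(self._provider_key, request)


def fake_provider(provider_key, request):
    """Stand-in for a third-party mail API."""
    return {"status": "sent", "to": request["to"]}


broker = Broker("REAL-PROVIDER-KEY", fake_provider)
console_key = broker.issue_key("console-a")
print(broker.handle(console_key, {"action": "send_mail", "to": "ops@example.com"}))
```

Every request is logged against the Console whose replacement key made it, and calling `issue_key` again for the same Console cycles its key without touching the provider's key at all.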

LIMITATIONS

An API broker like this isn’t for everyone. We now have an extra service to maintain and there’s a single service holding multiple API keys (but we are pretty happy with the trade-off).

In terms of scaling, while a proxy is a chokepoint, Canaries don’t generate loads of alerts and we haven’t seen bottlenecks (should that situation arise, load balanced Brokers are possible but not worth the added complexity right now).
 
We considered using one of the big API gateways, but decided against it. While they seem to focus on APIs going in the other direction, they can be twisted to suit our needs. We just didn't want a huge, relatively untrusted code-base fulfilling this role for us.

WHAT’S NEXT?

We’ve been running the Broker internally for a while and every day thousands of emails and SMSs are sent through the little broker that could. We’ll add a little more documentation and will throw it up on our GitHub account for others to use. If you are interested in playing with it, drop us a note @ThinkstCanary or at research@thinkst.com


A Steve Jobs masterclass (from a decade ago)

A decade ago, Steve Jobs sat down at the D8 conference for an interview with Kara Swisher and Walt Mossberg. What followed was a masterclass in both company and product management. The whole interview is worth watching, but I thought there were a few segments that stood out.
Caveat:
Any time someone talks about a tech-titan, there’s reflexive blowback from parts of the tech community: “He wasn’t really an engineer”, “He wasn’t really...” - This post will ignore all of that. Even if you strongly dislike him, there are lessons to be learnt here.

Let’s begin...

What matters most:
The interview starts with Kara and Walt congratulating Jobs, because Apple had just bypassed Microsoft in Market Capitalization. Right out of the gate, Jobs makes it clear:

It’s surreal to anyone who knows the history, but:
Jobs: It doesn’t matter very much... it’s not what’s important.. it’s not why any of our customers buy our products.. It’s good for us to keep that in mind, remember what we’re doing and why we doing it.
Even if he is just saying it for the crowds (and I don’t believe he is), he’s making it clear what they are about. He’s making it clear what will be celebrated in the company. It’s not about their market-cap, it’s about making things that their customers love.

On the importance of focus:
Literally the first question (after discussing market-cap) was asking Jobs about their decision to drop Flash, and their “war with Adobe”. His response on Flash alone makes this whole interview worth it. He starts with something subtle:
Jobs: Apple is a company that doesn’t have the most resources of everybody in the world, and the way we’ve succeeded is by choosing what horses to ride really carefully
This is an amazing answer, especially when it follows immediately from the discussion of their market-cap. They were not yet the Trillion Dollar Apple, but they were no slouches. The combination in 2010 of an ascendant Steve Jobs, their market-cap, and the halo around Macs and iPhones was a virtually irresistible magnet for just about any technical staff they wanted. They could have doubled their workforce in an instant and aimed to do everything, but you see a deep, deep understanding from Jobs that just adding people doesn’t mean you can do more great things.

Great products need great people and great focus. Every product manager in the world will voice approval for the Dieter Rams aesthetic and will talk about the importance of saying “no”. Very few people commit to it (and fewer still stick to their guns when people start calling them crazy for it).

It would have been trivial for Apple at that point to hire a team or two to shoehorn a so-so version of Flash onto their devices. (This seemed like an even bigger mistake on the iPad which had just launched). But he is clear and adamant. They are going to make calls they believe are best to shape great products.

On “Courage” (and choosing):
Mossberg pushes with a question many of us had at the time:
Mossberg: What if ppl say the iPad is crippled in this respect?

Jobs: Things are packages of emphasis...
Some things are emphasized in a product, some things are not done as well in a product, somethings are chosen not to be done at all in a product, and so different people make different choices and if the market tells us we making the wrong choices we listen to the market, we’re just people running this company, we trying to make great products for people, and we have at least the courage of our convictions to say we don’t think this is part of what makes a great product, we going to leave it out
...
We’re going to take the heat, because we want to make the best product in the world for customers.
It’s become something of a running joke when Apple-today talks about “courage”, but it really does take courage to swim against the tide this way. It takes courage to make a choice instead of simply adding a toggle and making the customer choose. It takes courage to look at a feature that lots of people “want” and make the call to exclude it. (This seems obvious and easy to sell, but suffers from the same perverse-incentive problem we see in infosec: excluding something gets you immediate, loud feedback but you seldom, if ever, get feedback that you did this right. It’s why people default to throwing in everything, and it’s why we end up with generic, unremarkable software.)

On putting in the hours:
There was an interesting discussion on the relatively new phenomenon of Jobs mailing people. They were discussing a specific event (Steve Jobs to Valleywag at 2:20 AM: “Why are you so bitter?” | VentureBeat), but I think again, it was revealing and insightful.
There’s been a huge push in recent times for balance, wellness and rest. People are quick to point out how burning the midnight oil translates to diminished capacity and actually increases your rate of errors, but I’ve never seen consistently great work from people unless they, too, were working deep into the night on projects they believe in.

It’s super telling that this 55-year-old billionaire would be “working on a presentation he was giving” at 02h00 in the morning. It is probably possible to do great things without burning the midnight oil; I’ve just never seen it personally, or been able to do it that way myself.

Why would a CEO need to work that hard anyway?
Dug Song often laments the lack of CEOs who can demo their own products. Any random 10 minutes of the interview should convince you that Jobs is smart, but it’s worth paying attention to the depth of his understanding on almost every topic raised.

When the topic of Foxconn comes up you see him rattle off details on the foreign factory that you wouldn’t expect him to have at his fingertips: suicide rates per capita, details of their investigations. He then effortlessly swaps to phone market share splits and technical details on Adobe products. Swap again into mobile-ads and he’s talking about Flurry-Analytics by name and describing his hatred of mobile ads and what’s wrong with them. Swap again and he’s talking browser share and how Open-Sourcing WebKit might make inroads against IE (WebKit-powered Chrome went on to overtake IE in 2012).

I’ve no doubt he had teams of smart people driving teams of smart people, but it's also clear that he was deeply involved with every aspect that ends up touching a customer. It demands obsession, and it demands time, and I’d hypothesise that one of the main reasons we are able to talk about his depth of involvement is because we've already discussed him “putting in the hours”.

On taking yourself too seriously:
It’s clear that Jobs is focused (and a little tightly wound). Stories abound of terrifying elevator rides with him, but at the quarter-mark of the interview, he gives us a tasty morsel. Mossberg begins by saying that Jobs spent a good chunk of his career fighting a platform war with Microsoft. Jobs actually doesn’t concede this.
In a moment of self-deprecation/levity he quips “We never saw ourselves as in a platform war with Microsoft and maybe that’s why we lost”. It’s a quick flash of him taking himself less-seriously and it’s disarming.

Of course he immediately gets back onto his core message:
Jobs: We always thought about how can we build a much more better product than them, and I think that’s still how we think about it
Nobody can listen to this and have doubts as to what their North Star is.

On the Consumer market vs The Enterprise market:
In 2008 (two years earlier), Google launched its Android phone and started competing with Apple on mobile. Swisher asked if Apple planned to remove Google apps from the iPhone.
The Jobs answer is almost predictably monotonous:
Jobs: No... we want to build a better product than they do, and we do...
That’s what we are about
He goes further though, and opens up an interesting topic: the difference between making/selling products in the consumer and enterprise spaces:

Jobs: We are about making better products, and what I love about the consumer market that I always hated about the enterprise market, is that: we come up with a product, we try to tell everybody about it and every person votes for themselves. They go yes or no, and if enough of them say yes, we get to come to work tomorrow. You know that’s how it works. It’s really simple.
Whereas with the enterprise market, it’s not so simple, the people that use the products don’t decide for themselves, the people that make those decisions sometimes are confused.
We love just trying to make the best products in the world for ppl and having them tell us by how they vote with their wallets whether we’re on track or not.
This was always a horrible trend and it's why organizations have suffered for so long with horrible enterprise software.

Ten years later, we know this is changing, and it's devices like the iPhone and iPad proliferating into the enterprise that spurred this change. At Thinkst Canary, we are huge beneficiaries of it. We build a cyber security product that’s used by some of the biggest names on the internet, but our look and feel is as “consumer” as it gets.

When a16z’s Martin Casado spoke last year about the latest trends in the good guys’ favour in Cyber Security, he touched on our Canaries, and pointed out that while we are used by some of the best names in the valley, we are simple enough for him to install us at home.
Casado: Every sophisticated logo that you know use these. They have over 700 customers and yet they are simple enough for you and I to buy over the web and install. So it just goes to show you to what level sophisticated security is being consumerized (and you should know about it)
We think Jobs called it absolutely right in 2010, and are pretty convinced that those ugly enterprise-walls are crumbling… 

On putting things back on the shelf:
One of the highlights of the interview (for me) was Jobs’ discussion of what came first: the iPhone or the iPad? Until this point, it was always assumed that Apple built the iPhone, saw success, and then said: “let's make another one, but bigger”.

The truth holds more lessons for us:
Mossberg: Did you consider doing a tablet when you did the iPhone?

Jobs: I’ll actually tell you kind of a secret, I actually started on the tablet first.
I had this idea of getting rid of the keyboard and type on a multitouch glass display... and I asked our folks could we come up with a multitouch display that we could type on that I could rest my hands on and about 6 months later they called me in and showed me this prototype display and it was amazing and... I gave it to one of our brilliant UI folks and he called me back a few weeks later and he had inertial scrolling working and a few other things. Now we were thinking of building a phone at that time and when I saw that rubber band inertial scrolling and a few of the other things, I said “my god... we can build a phone out of this”. I put the tablet project on the shelf because the phone was more important and we took the next several years and did the iPhone. And when we got our wind back, and thought we could take on something next, we took the tablet off the shelf, took everything we learned from the phone, and went back to work on the tablet.
So.much.goodness.

The first thing worth noting is that they didn’t rush out a tablet as soon as they could. They knew there was a need, but they were doing the work to figure out if they could fill that need. 

Microsoft fans will often point out how they had a Windows tablet before an iPad, and a Windows smartphone before an iPhone, but both will have to admit that those devices took the PC, and attempted to shrink it down to smaller form factors. So you could totally run MS Excel on your Windows phone, if you could use a stylus to give you the pin-point accuracy of your mouse pointer...

This is not an argument against launching early and iterating, but it does remind us that, like all aphorisms, it misses some nuance. Apple wants a tablet, they see its promise, but they believe there are a few problems that need solving before their MVP is V enough.

We also see Apple sticking to their guns. It’s easy to say “we say ‘no’ to things” when you are striking off a bunch of mediocre ideas. Saying no to the tablet you really want to build (to focus on the phone you think you can build now) is where you see this choice play out for real. (That I’m writing this on an iPad while many of you are reading it on an iPhone is the next order effect of making those choices).

On gratitude:
We have a pretty great record as a young software company. Our NPS scores are consistently north of 70 and we have a page on our website dedicated to unsolicited feedback from customers who talk about loving Canary. A fair bit of that feedback notes that people have loved interacting with other parts of our business: Support; Admin; even billing! 

We used to find it odd that people found it odd!

To us, it seems like the most natural way for this to be. We know that there’s a zillion products in the marketplace, but someone chose us. We know that on an Internet filled with people saying stuff, our customers have listened to our voice. Every single person in the company is aware of this, and is deeply grateful to our customers for it.

The customer has choices, and the customer chose us. Seen through this lens, gratitude seems like the obvious response but it disappears when customers are seen as just a prop to help you “make the quarter”.

Jobs makes it clear that this doesn’t have to dull as your market cap grows:
Jobs: I have one of the best jobs in the world. I’m incredibly lucky. I thank all of our customers and employees for letting me do what I do. I get to come in and work every morning and I get to hang around some of the most wonderful, brightest, committed people I’ve ever met in my life, and together we get to play in the best sandbox I’ve ever seen and try to build great products for people.
Market share and profit totally matter because they let us go on doing what we love. They are tangible feedback that customers like what we are building, and can be a proxy for customer happiness, but it's super important for us to make sure we don’t fall foul of Goodhart's law, making the measure our actual target instead.

On Humility and Hubris:
Jobs was often pilloried for telling people they were wrong. Just weeks later Apple would go through Antenna-Gate, where it was discovered that their new iPhone dropped calls when gripped a certain way. He famously quipped that people didn’t know what they wanted till you showed it to them. But several times during the interview he makes it clear that this doesn’t translate to ignoring your users forever: you do your homework, you make your choices, and then you pay attention to see how your users respond.

For a person with so many phenomenal products under his belt, he also doesn’t assume they are infallible. A few times during the talk he makes it clear, “we are just humans trying our best to do what we think people will love”. Someone uses the Q&A session to remind him that sharing content between devices was still painful. He’s quick to point out:
Jobs: we need to work harder on that. We need to do better. We’re working on it.

Product design (and running a company) bring Jim Collins’ “the Genius of the AND” to the fore as much as anything. You need the courage to make choices, AND you need to be able to listen to customers and use their feedback. It’s just the nature of the beast.

On pragmatism and “seeing the whole product”:
In one of his great twitter threads on the difference between Good Product Managers and Great Product managers, @Shreyas Doshi says:
https://twitter.com/shreyas/status/1249039984071344128?s=21
A clear Apple/Jobs fan implores him to save TV. “You did it with the iPhone, now save TV”. Jobs gives an answer that encapsulates so many of the thoughts discussed earlier.
Jobs: The problem with the television market, the problem with innovation in the television industry is the go-to-market strategy. The TV industry fundamentally has a subsidized business model that gives everybody a set top box for free, or for $10 a month and that pretty much squashes any opportunity for innovation because nobody’s willing to buy a set top box. Ask Tivo, ReplayTV, Roku, ask Vudu, ask Google in a few months.

Sony’s tried as well, Panasonic’s tried, lots of people have tried, they all failed.

So all you can do, … is go back to square 1, redesign the set top box, with a consistent UI, across all these different functions, and get that to the consumer in a way that they are willing to pay for it.

And right now there’s no way to do that. So that’s the problem with the TV market. 

We decided what product do we want the most, better TV or a better phone? Well the phone won out, but there was no chance to do a better TV because there was no way to get it to market.
What do we want more, a tablet or a better TV? Well probably a tablet, but it doesn’t matter because if we wanted a better TV, there’s no way to get it to market.

The TV is going to lose until there’s a better, until there is a viable go-to-market strategy. Otherwise you’re just making another TiVo. That’s the fundamental problem. It’s not a problem of technology, it’s not a problem of vision, it’s a fundamental go-to-market problem.
Question: In the phone area you were able to recreate that go-to-market strategy by working with the carrier, so does it make sense to partner with the cable operator?
Jobs: Well then you run into another problem, there isn’t a cable operator that’s national. There’s a bunch of cable operators and then there’s not a GSM standard where you build a phone in the US and it works in all these other countries, no, every single country has different standards and different government approvals. It’s very [...] Balkanized, but when we say AppleTV is a hobby, that’s why we use that phrase.
The market place is filled with products that were built because they technically could be. Notice that he isn’t just talking about building a good product, he’s considering how they will stock it, how they will distribute it. They know it's an area that has some promise (which is why they maintain their Apple-TV hobby) but he has no delusions that because they are Apple they can make a channel where one does not yet exist. (You see this in sharp relief when he shares a stage with Bill Gates and is asked questions on the future of computing. While Gates happily talks of 3d VR interfaces, Jobs is more grounded, considering what people are used to and how they will operate it).

In “The Innovation Stack”, Square’s co-founder Jim McKelvey talks about how he struggled to find mentors till he realised that he didn't need to confine himself to people he could call on the phone.
Jobs produced a string of hits across a raft of products and managed to generate such maniacal love that his top customers almost define the term “fan-boys”. If we want to build great products, it would be smart to take a closer look at how he did it.

Ben Horowitz often starts his posts with a choice phrase from a contemporary hip-hop song. I’m going to go the other way, ending with a line from Kenny Rogers’ “The Gambler”.

“And somewhere in the darkness
The gambler he broke even
But in his final words
I found an ace that I could keep”


Good UNIX tools

aka:  Small things done well 

We spend a lot of time sweating the details when we build Canary. From our user flows to our dialogues, we try hard to make sure that there are very few opportunities for users to be stuck or confused.

We also never add features just because they sound cool.
Do you “explode malware”? No. 
Export to STYX? No. 
Darknet AI IOCs? No. No. No.

Vendors add rafts of “check-list-development” features as a crutch. They hope that one more integration (or one more buzz-word) can help make the sale. This is why enterprise software looks like it does, and why it’s probably the most insecure software on your network.

This also leads to a complete lack of focus. To quote industry curmudgeon (and all around smartypants) Kelly Shortridge: "it is better to whole-ass one thing than to half-ass many". We feel this deeply.

Most of us cut our teeth on UNIX and UNIX clones and cling pretty fastidiously to the original Unix philosophies¹:
  • Make each program do one thing well
  • Expect the output of every program to become the input to another
This is pretty unusual for modern security software. Everybody wants to be your “single pane of glass”. Everybody wants to be a platform.

We don’t. 

Tired: Vendors trying to be an island.
Wired: Vendors who work well together.
Inspired: Let’s get ready to Rumble...

Rumble, HD Moore’s take on network discovery, shares a similar perspective, and provides effortless network inventory visibility without credentials, tap ports, or heavy appliances. Rumble tries to provide the best inventory possible through a single agent and a light (but smart) network scan that is safe to run in nearly any environment. (If you are a fan of the quick deployment and light touch of Canaries, you should check out Rumble’s similar approach to network asset inventory!)

It's fast, it has a free tier, and now it integrates with your Canary Console too.

To illustrate this integration, assume someone reaches out to a (fake) Windows Server called \\BackupFS1 and copies \\Salaries\2020\Exco-Salaries.xlsx. Your Canary will send a single, high fidelity message to let you know that skullduggery is afoot. We can tell you that host-192.168.2.136 accessed the Canary, and that AcmeCorp/Bob (or his creds) accessed the share. 


We give you some details on the attacker, but what if you were also running a Rumble inventory of this network? Well then we can simply hand you over.

From June, Canary customers who are also running Rumble will notice a new integration option under their Flocks Settings.

    
Rumble Integration Settings

Once this is turned on, IP Addresses in alerts will include a quick link that allows you to investigate the address inside of Rumble.


The integration is light and non-obtrusive, but should immediately add value. It also affords us room for a slight flourish. It’s possible that you could use both Canary and Rumble, and never visit the settings page to enable the feature. We have users with hundreds of birds who only visit their Console once or twice a year (when there’s an actual alert). It’s ok. We got you!


The Canary Console will automatically detect if you have a valid Rumble login, and if you do, will enable the integration to show you the link². You won’t have to think about it, it will “just work”.



____________

¹ https://archive.org/details/bstj57-6-1899/page/n3/mode/2up

² If you hate this, you can stop it from happening by setting the integration to “never” in your settings.


Why control matters

In March we moved from Groove to Zendesk, and with this migration our Knowledge Base (KB) moved too.

The challenge we faced was name-spacing - KB articles hosted on Groove were in the name-space http://help.canary.tools/knowledge_base/topics/, but the name-space /knowledge* is reserved on Zendesk and is not available for our use. This forced us to migrate all KB pages to new URLs and update the cross-references between articles. This addressed the user experience for anyone landing at our KB portal by clicking a valid URL or by typing https://help.canary.tools in a browser.

What isn’t resolved though, is thousands of Canaries in the field that have URLs now pointing to the old namespace. We design Canary to be dead simple, but realise that users may sometimes look for assistance. To this end, the devices will often offer simple “What is this?” links in the interface that will lead a user to a discussion on the feature.

With the move (and with Zendesk stealing the namespace), a customer who clicked on one of those links would get an amateurish, uninformative white screen saying “Not Found”.

This is a terrible customer experience! 

The obvious way forward is a redirect mechanism which maps https://help.canary.tools/knowledge_base/* URLs to the Zendesk name-space. By implication - the DNS entry help.canary.tools cannot point directly to Zendesk; it needs to point to a system that is less opaque to us so we can configure it at will.


That’s straight-forward! On a fuzzy-match we should have something up and running in minutes with AWS CloudFront. This allows us to map name-spaces from https://help.canary.tools/* to https://thinkst.zendesk.com/* with minimal effort.

Step 1:
URL: https://help.canary.tools/some/uri
GET /some/uri HTTP/1.1

Step 2:
URL: https://thinkst.zendesk.com/hc/en-gb/some/uri
GET /hc/en-gb/some/uri HTTP/1.1

The next step is to intercept requests to the /knowledge_base name-space, and return an HTTP/301 redirect to the correct URL in Zendesk. We make use of the Lambda@Edge functionality to implement a request handler.
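A handler along these lines might look like the sketch below. It assumes a viewer-request trigger and the standard Lambda@Edge event layout; the KB_REDIRECTS table, the HELP_HOME constant and the fallback to the home page for unknown articles are illustrative assumptions, not our production code.

```python
# Hypothetical mapping from old Groove paths to new Zendesk article paths.
# The single entry is an illustrative placeholder.
KB_REDIRECTS = {
    "/knowledge_base/topics/url1": "/hc/en-gb/articles/360002426777",
}

# Assumption: redirects always stay on our own public hostname.
HELP_HOME = "https://help.canary.tools"


def _redirect(status, description, location):
    """Build a response that CloudFront returns directly to the viewer."""
    return {
        "status": status,
        "statusDescription": description,
        "headers": {"location": [{"key": "Location", "value": location}]},
    }


def lambda_handler(event, context):
    """Redirect old KB URLs; pass every other request through untouched."""
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"]

    if not uri.startswith("/knowledge_base"):
        return request  # not an old KB URL: let CloudFront forward it

    target = KB_REDIRECTS.get(uri.rstrip("/"))
    if target:
        return _redirect("301", "Moved Permanently", HELP_HOME + target)
    # Assumption: unknown articles are sent to the KB home page.
    return _redirect("302", "Found", HELP_HOME)
```

Returning the request object unchanged tells CloudFront to carry on to the origin, while returning a response dictionary short-circuits the request at the edge.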


Thirty minutes and a few lines of Python later, it seemed like we had it all figured out, but for one not-so-minor detail - images in KB articles weren’t loading. What was going on?

The DNS record help.canary.tools was pointing to CloudFront while the origin for the content was configured as thinkst.zendesk.com, so when CloudFront requested an image it got an HTTP redirect back to itself, causing an infinite redirect loop.

Surely this is fixable by adding the correct Host: canary.tools header to the request? Nope! Instead of a redirect, we were now getting a 403 from CloudFlare (N.B. NOT CloudFront) - Zendesk uses CloudFlare for its own content delivery. WTF?!?

After a few iterations the “magic” (read: plain obvious) incantation was discovered. Note 104.16.55.111 is the IP address behind thinkst.zendesk.com

This is somewhat expected, since Zendesk is configured to think it's serving requests for help.canary.tools.

Without this option Zendesk rewrites all relative URIs in the KB to yet another name-space: https://thinkst.zendesk.com/* which brought its own set of challenges, complexity and non-deterministic behavior.

To avoid confusion and further issues down the line we imposed a design constraint on ourselves - a simplifying assumption: the browser’s address bar should only ever display help.canary.tools - the thinkst.zendesk.com name-space should never leak to customers.

Committed to this approach, the next hurdle we faced was Server Name Indication (SNI). 

Server Name Indication (SNI) is an extension to the Transport Layer Security (TLS) computer networking protocol by which a client indicates which hostname it is attempting to connect to at the start of the handshaking process. This allows a server to present multiple certificates on the same IP address and TCP port number and hence allows multiple secure (HTTPS) websites (or any other service over TLS) to be served by the same IP address without requiring all those sites to use the same certificate.

CloudFront was doing exactly what it was configured to do. It connected to (and negotiated SNI for) thinkst.zendesk.com which resulted in a 403 error because Zendesk is configured for SNI help.canary.tools.

For any of this to work, what CloudFront needed to do was connect to thinkst.zendesk.com (104.16.55.111), but negotiate SNI for help.canary.tools. By any other name - we needed “SNI spoofing” (not really a thing - I just coined the phrase).
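To make the idea concrete: the SNI name is simply a field the TLS client writes into its ClientHello, and it need not match the name (or IP address) the client dialled. The sketch below uses Python's standard ssl module with in-memory BIOs in place of a real connection, so nothing touches the network; the helper name client_hello_for is our own invention for illustration.

```python
import ssl


def client_hello_for(sni_name):
    """Return the raw ClientHello bytes a TLS client would send for sni_name.

    Memory BIOs stand in for the socket, so no connection is ever made.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = context.wrap_bio(incoming, outgoing, server_hostname=sni_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client has spoken and now waits for the server
    return outgoing.read()


hello = client_hello_for("help.canary.tools")
# The requested hostname travels in the clear inside the ClientHello.
print(b"help.canary.tools" in hello)
```

With a real socket the same trick is a matter of connecting to one address (say 104.16.55.111) while passing server_hostname="help.canary.tools" to wrap_socket. That is the behaviour we needed from our proxy.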

Can CloudFront do that? No, it can’t :_( And just like that we had to rethink our approach - CloudFront was not the solution.

Another failed approach was setting the Host mapping field in Zendesk to kb.canary.tools, and it may have worked but for a bug in Zendesk which fails to re-generate SSL certificates when the Host mapping field is updated in their admin console, so browsing to https://kb.canary.tools was met with certificate validation errors. How long does it take for Zendesk to rotate certificates? We don't know (but it's more than 30 minutes).

There were just too many moving parts in too many systems to allow us to sanely and consistently reason about customer experience.

Retrospectively, the root-cause of all our problems was still related to name-spacing: both CloudFront and Zendesk (rightfully) believed they were authoritative for the hostname help.canary.tools.

  • From the perspective of the entire Internet - help.canary.tools points to CloudFront.
  • From the perspective of CloudFront - help.canary.tools points to Zendesk.
So if both systems share the same name in public, how do the two systems address each other in private?
The answer was some form of Split-Horizon DNS. The least sophisticated version would've been to simply hack the /etc/hosts file on the host serving requests for help.canary.tools, but this functionality exists natively in Nginx's upstream{} block. Of course, those IP addresses could change, but this is a manageable risk that can be remediated in minutes. In contrast, round-trip times on tickets to Zendesk are measured in days.

The proxy_ssl_server_name option enables SNI on the upstream connection, and the kb_uri variable uses Nginx's map module (ngx_http_map_module) for performing lookups/rewrites on URLs in the https://help.canary.tools/knowledge_base/* name-space.

In the end, the Nginx configuration necessary to address our needs was as simple as this:

map $request_uri $kb_uri {
    default "";
    # old Groove path -> new Zendesk path
    /knowledge_base/topics/url1 /hc/en-gb/articles/360002426777;
    ....
}

upstream help.canary.tools {
    # dig help.canary.tools
    server 104.16.51.111:443;
    server 104.16.52.111:443;
    server 104.16.53.111:443;
    server 104.16.54.111:443;
    server 104.16.55.111:443;
}

server {
    listen 443 ssl;
    server_name help.canary.tools;

    location / {
        # the upstream shares our public name, so SNI stays help.canary.tools
        proxy_pass https://help.canary.tools/;
        proxy_ssl_server_name on;
    }

    location /knowledge_base/ {
        if ($kb_uri != "") {
            return 301 https://help.canary.tools$kb_uri;
        }
        # unmapped article: send the customer to the KB home page
        return 302 https://help.canary.tools;
    }
}
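The map block above is the only part of the configuration that grows as articles are added, so it lends itself to being generated rather than hand-edited. A minimal sketch, assuming the old-to-new mappings live in a Python dictionary (render_kb_map and the sample entry are illustrative, not part of our tooling):

```python
def render_kb_map(redirects):
    """Render an Nginx map block translating old KB paths to new ones."""
    lines = ["map $request_uri $kb_uri {", '    default "";']
    # Sort so the generated file is stable between runs.
    for old_path, new_path in sorted(redirects.items()):
        lines.append(f"    {old_path} {new_path};")
    lines.append("}")
    return "\n".join(lines)


print(render_kb_map({
    "/knowledge_base/topics/url1": "/hc/en-gb/articles/360002426777",
}))
```

Generating the block also makes it harder to forget the trailing semicolon that Nginx insists on for every map entry.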

Where are we now?

https://help.canary.tools is now Nginx running on EC2. It's all Dockerized and Terraformed so the configuration and deployment is reproducible in minutes.



Nginx SSL certificate renewals and refreshes are automated using CertBot (thanks to this guide). Down the line we can add ModSecurity, giving us a level of visibility into potential attacks against the site - visibility that would have been unattainable even if CloudFront had been a viable solution.

Using Docker's native support for AWS CloudWatch, all Nginx access logs land up in CloudWatch, which gives us dashboarding, metrics and alarming for free.


We now get alerted every time a customer attempts to access a missing URL in the https://help.canary.tools/knowledge_base/* name-space. Meanwhile, the customer doesn't get an ugly "Not found" error message - they are redirected to our Knowledge Base home page where they can simply use the search function. This has already paid dividends in helping us rectify missing mappings.
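The signal behind that alert is easy to describe: any request under /knowledge_base/ that Nginx answered with the 302 fallback is a mapping we have not written yet. As a rough sketch of the idea (our real alerting lives in CloudWatch; this standalone version assumes Nginx's default "combined" access log format, and the sample lines are made up):

```python
import re
from collections import Counter

# Pulls the request path and status code out of a "combined" format log line.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def unmapped_kb_paths(log_lines):
    """Count old KB paths that fell through to the 302 fallback."""
    misses = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not an access log line we understand
        if match["status"] == "302" and match["path"].startswith("/knowledge_base/"):
            misses[match["path"]] += 1
    return misses


sample = [
    '203.0.113.7 - - [01/Jun/2020:10:00:00 +0000] "GET /knowledge_base/topics/url1 HTTP/1.1" 301 169 "-" "curl/7.68.0"',
    '203.0.113.7 - - [01/Jun/2020:10:00:05 +0000] "GET /knowledge_base/topics/missing HTTP/1.1" 302 145 "-" "curl/7.68.0"',
]
print(unmapped_kb_paths(sample))
```

Each path that shows up in the result is a candidate for a new entry in the redirect map.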


From CloudWatch we can directly drill down into Nginx access logs to examine any anomalous behavior.



This is in stark contrast to the world where the application layer was opaque to us - bad user experiences and broken links would have gone completely unnoticed.

Control matters. This is why.