• On SolarWinds, Supply Chains and Enterprise Networks

    The recent SolarWinds incident has managed to grab headlines outside of our security ecosystem. The many (many) headlines and column inches dedicated to the event are a testament to the security worries that continue to reverberate around the globe. But we think that most of these articles have buried the lede.

    Most discussions take the position that our enterprises are horribly exposed because of supply chain issues, and that any organisation running SolarWinds should consider itself compromised.

    We think it’s actually more dire than that (and suspect it’s going to get worse). Let us lay out the case for why SolarWinds should concern you even if their tools are nowhere near your networks.

    It’s easy to whip up a think-piece in the wake of a public security incident, especially as a vendor. The multitude of vendor mails riding the SolarWinds incident is overflowing our inboxes. But even a stopped clock is right twice a day, and this is one of those times.

    An abstracted, low resolution summary for those (very few) who haven’t paid attention to the incident:

    • SolarWinds makes a network management product called Orion, which is deployed in tens of thousands of organisations worldwide;
    • Attackers broke into SolarWinds and made their way to the SolarWinds build environment;
    • They compromised the build pipelines to inject malicious code into the SolarWinds update process;
    • Organisations all over the world updated themselves with this poisoned update;
    • (Now) Compromised SolarWinds servers worldwide attacked internal networks of selected organizations;
    • Almost nobody discovered any of this for months, until a security company discovered its own compromise.

    The technique of compromising a single source which then updates other nodes isn’t novel. As recently as yesterday we saw headlines like: “Barcode Scanner app on Google Play infects 10 million users with one update”, and indeed this was how Will Smith and Jeff Goldblum saved us when the aliens first made contact.
    The attack gets called a “supply chain” attack, which hints at war-time tactics and, I’m willing to bet, will launch a dozen cyber security / resilience startups. People are (rightfully) worried about the knock-on effect, since the SolarWinds attackers had access to several other software product companies and could have poisoned those wells too. This is definitely scary! But hear me out. It’s actually a little bit worse than you might think.

    Why it’s actually worse than we think

    The state of enterprise security: While we’ve made progress in some areas of information security (e.g. the degree of knowledge and skill required to exploit memory corruption bugs in modern OSs), enterprise security is still stuck pretty firmly in the early 2000s. An enterprise network consists of an untold number of disparate products, loosely coupled through poorly documented interfaces, where often the standard for product integration is “this config works, don’t touch it”. Any moderately skilled attacker will decimate an internal corporate network long before they are discovered, and the average time it takes to gain Domain Admin is measured in hours and days instead of weeks or months.
    Most organizations don’t know this though. They know they spend money on security and they know they see charts tracking progress. Most have no clue that faced with an average attacker of moderate skill, they’d almost certainly come off second best.
    Enterprise Products: Even ignoring the weakness that comes with cobbling together many products (security at the joints), most enterprise products won’t hold up very well to serious security testing. Heavyweight vendors like Adobe and Microsoft were publicly spanked into upping their game years ago, but it drops off pretty steeply after them. There’s an interesting carveout for online SaaS companies, who have to build security competency since they run their own infrastructure and compromising their products is the same as compromising them. But for products installed into an Enterprise network the incentives are horribly misaligned. Owning, say, Symantec’s antivirus agent doesn’t compromise Symantec; it compromises you (who are running it), and this separation makes all the difference.
    Enterprise networks have too many moving parts: The past few years have seen creative hackers exploit software in places that we never knew were running software. The Thunderstrike crew ran code on Apple video adaptors. Ang Cui has run code on monitors and office phones. Bunnie and xobs ran code on SD-cards, and a number of people have now run Linux on hard drive controllers. This makes it clear that the average office network is connected to dozens and dozens of types of devices that won’t ever make it into a regular audit, but that are nonetheless capable of hiding attackers and injecting badness into your network.
    3rd Party Risk evaluations:  The joke going around after the incident was that SolarWinds had negatively impacted hundreds of organisations, but definitely passed their 3rd party risk evaluations. It’s slightly unfair, but also true. We don’t have a good way for most organizations to test software like this, and 3rd party questionnaires have always been a weak substitute. Even if we could tell if a product was meeting a minimum security bar (using safe patterns, avoiding unsafe calls, using compile time safety nets, and so on) auto-updates mean that tomorrow’s version of the product might not be the product you tested today. And if the vendor doesn’t know when they are compromised, then they probably won’t know when their update mechanism is used to convert their product into an attacker’s proxy.
    (Please note: we aren’t saying that auto-updates are bad. We believe they solve important problems and we make use of them in our product, but they do introduce a new set of variables that need to be considered. We discussed this in more detail in a previous post of ours: “If I run your software, can you hack me?”)
    The current focus on “supply chain” security will no doubt birth a bunch of companies claiming to solve the problem, but this part of the problem seems intractable. There’s the “easy” suite of software you know about: applications installed on your infrastructure, their dependencies, and so on. But for one, this ignores your vendors’ own vendors. In addition, what product is going to provide guidance on the provenance of the code running in your monitors (on processors we didn’t even know were there)? Will we examine the firmware on the microphone that people are now using for their Zoom calls? Will we re-examine it after its update? There are just too many connected pieces of code to tackle the problem from this angle.
    Enterprise Security Software: Amazingly, if enterprise products as a whole can be classified as insecure, enterprise security products in general are super duper insecure. Dr Mudge warned us in the early 2000s that security products were not necessarily secure products, but not enough people took notice. Many a Veracode report has placed enterprise security products near the bottom of the product pile when tested for security defects.
    FX famously quipped that “basically by quality level you would be better off defending your network with microsoft word than a checkpoint firewall”. (It’s funny because it’s sad).
    If it takes just hours or days to successfully compromise an internal network, and if the average network has enough hiding places for skilled attackers to burrow deep, what do you think happens when attackers are allowed to move around undetected for months?
    All of these factors have been true for decades and have not visibly led to too many melt-downs. This is changing though, because of a kind of “Roger Bannister” effect. Breaking the 4-minute mile seemed impossible till Roger Bannister did it in 1954. Then it was matched repeatedly in fairly quick succession. (Today, several high school runners have matched the feat.) Often people just need to see someone else cross that line. It’s not uncommon to see certain bugs considered unexploitable for a while, only to have the floodgates open after the first working exploit is released.
    When Stuxnet made the news in 2010, the result was a global realisation that software exploits could be used to good effect in the real world, but the attack remained fairly magical and esoteric: it targeted centrifuges, involved multiple 0-days, and infected Step7 systems to get manually introduced to the PLCs. The Snowden leaks a few years later, however, made it clear that smaller-scale, well targeted exploits could be used to achieve results too. If any country was slow to get into offensive cyber pre-Snowden, very few (who were paying attention) were after that. Governments started tooling up and the commercial industry didn’t hesitate to fill in the gaps.
    When the Ukrainian tax accounting package MEDoc had its update mechanism compromised to deploy malware to its clients, well, the writing was on the wall. Attacking popular vendors as a route into customers was clearly effective and to some actors, squarely on the table.
    A bunch of analysts looking at the SolarWinds incident point out (correctly) that compromised SolarWinds servers were on so many networks that the ripples of this attack could spread exponentially. What this analysis misses is that the average enterprise runs dozens and dozens of SolarWinds look-alikes too.
    Ransomware didn’t spring up overnight. Networks hit by ransomware were typically vulnerable for years, and ran along blissfully unaware of it until attackers evolved a method to take advantage of it. Most enterprises have been completely vulnerable to their vendors’ horrible insecurity too; the SolarWinds incident just published a blueprint for how to abuse it.
    The situation is dire not because we are fighting some fundamental laws of physics, but because we’ve deluded ourselves for a long time. If there’s a silver lining to this, it’s that customers will hopefully demand more from their vendors: proof that they’ve gone through more than compliance checklists, and proof that they’d have a shot at knowing when they were compromised. That more enterprises will ask: “How would we fare if those boxes in the corner turned evil? Would we even know?”
    P.S. We’ve written previously about how we think about security as a vendor [here].
    P.P.S. We build Thinkst Canary, a quick, low effort, high fidelity way to detect badness on your network. We didn’t write this article because we built Canary. We build Canary because we believe what we’ve written in this article…
  • Hackweek 2020

    Because we can

    One of our great pleasures and privileges at Thinkst is that every year we set aside a full week for pure hacking/building. The goals for our “Hackweek” are straightforward: build stuff while learning new things. Last week was the 2020 Hackweek work-from-home edition, and this post is a report on how it went.

    Now in its fourth year, our Hackweek has come to serve as a kind of capstone to our year, and folks start thinking about their projects months in advance. The previous editions produced some truly awesome projects, and topping them was a serious challenge. Without question, this has been our finest so far.

    We run Hackweek for multiple reasons. We’re a company of tinkerers and builders, and dedicating time towards scratching that itch just feels right to us. Of course, there are sometimes downstream benefits to Thinkst, either in terms of the projects folks worked on, or the skills they’ve picked up. (Replacing our Redmine with Phabricator was a project in Hackweek ’17 that brought us much value and is still in use.) But that’s a pleasant side-effect, and not the objective. A key underpinning to Hackweek is that the projects don’t need to be related to Canary or other work projects. When we say “build something”, it can literally be anything, and some folks steered far from tech (as we’ll see shortly). We want folks to continually learn, and this sets the tone. While we provide training through the year for topics in our day-to-day work, Hackweek gives the team a chance to stretch themselves in directions they hadn’t previously considered.

    Hackweek format

    The structure of the Hackweek is that on Monday we kick-off, and on Friday afternoon everyone demos their project. Following that, we vote on projects in three separate categories:

    • Most Joyful
    • Most Useful
    • Most Hacky

    The progression of Hackweek over the years tracks well with the team growth we’ve seen at Thinkst. In the first few editions, an afternoon was more than sufficient for all the demos, but we had 20 projects this year and that’s tricky to squeeze in. It’s apparent that a rethink is needed for the next edition. Nice problems to have!

    The three winning projects

    The prizes are secondary to the aim of the week, and mostly provide a fun incentive for folks to aim in different directions. Here’s a run-through of the winning projects, plus a report on the others below.

    Most Hacky

    Jay and Max decided that their years of gaming experience weren’t enough of an edge when playing Counter-Strike: Global Offensive. To make up the gap, they created a series of game hacks for CS:GO. Their hacks run as a separate program which accesses the CS:GO game’s memory, and changes values on the fly. Hit a key shortcut, and other players become visible through the walls. Hit another shortcut and the crosshairs snap onto the nearest enemy’s head to get a guaranteed headshot every time, even taking into account recoil patterns. Yet another shortcut, and enemies show up on your game radar, so you always know where they are. No fair!
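    For the curious, the aim-snapping part of a hack like this is mostly trigonometry once you can read player coordinates out of the game’s memory. A minimal sketch of that maths in Python (the coordinates here are made up; the real hack reads and writes these values through process-memory APIs, which we’ve left out):

```python
import math

def aim_angles(own_pos, target_pos):
    """Compute the (yaw, pitch) view angles, in degrees, that point
    from own_pos at target_pos.

    Positions are (x, y, z) tuples in game-world coordinates; an aimbot
    writes the resulting angles back into the game's memory.
    """
    dx = target_pos[0] - own_pos[0]
    dy = target_pos[1] - own_pos[1]
    dz = target_pos[2] - own_pos[2]
    yaw = math.degrees(math.atan2(dy, dx))          # left/right angle
    horizontal = math.hypot(dx, dy)                 # distance in the ground plane
    pitch = math.degrees(math.atan2(dz, horizontal))  # up/down angle
    return yaw, pitch
```

    The real thing also has to compensate for recoil patterns and smooth the snap so it looks human, but the core is just this.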

    See-through walls? Sure, why not.

    Most Joyful

    Louise taught herself how to crochet this week, starting from scratch. Crocheting has a bunch of technical details in how the knots are tied, the different patterns, and putting them together to produce articles. But she didn’t just limit herself to 2D articles; she went all out and produced three separate 3D birds, plus a crocheted Canary device. To top it off, she took them on a hike near Stonehenge for this final shot:

    Early morning birds

    Expect to see more from the “Inyarnis” in our weekly mails.

    Most Useful

    Sherif decided to hit a problem near and dear to his heart. We use Salesforce as a CRM, and for the Customer Support and Success teams, switching to Salesforce to look up details is a common daily task. But there’s friction in performing this, and he wanted to file down that edge. Slack is our internal comms tool of choice, and Sherif built a Slackbot which interfaces to Salesforce to assist with querying customer details from directly within Slack. The Support and Success teams are thrilled!
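    At its heart, a bot like this just reshapes a Salesforce record into a Slack-friendly message. A rough sketch of that formatting step in Python (the field names here are illustrative, not our actual Salesforce schema, and the Salesforce query and Slack delivery are elided):

```python
def customer_summary(record):
    """Render a customer record (a plain dict here, for illustration)
    as the text body of a Slack message."""
    lines = [
        f"*{record['name']}*",                       # Slack bold markup
        f"Licensed birds: {record['bird_count']}",
        f"Renewal date: {record['renewal_date']}",
        f"Account owner: {record['owner']}",
    ]
    return "\n".join(lines)
```

    A slash-command handler would look the customer up by name, call this, and post the result back into the channel.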

    Slack command to quickly get an overview of a customer

    Great projects

    Here’s a rundown of the other projects.

    Anna created CN-D, a machine to forge signatures (or draw anything in pen). She built a CNC machine, replaced the drill with a pen holder, and figured out a workflow to take SVGs to CNC files. With an SVG of someone’s signature or a scan of a written page, she could sign documents as them 🤦‍♀️. She also had it draw our logo in pen. It’s an amazing project to hit in one week.
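    The SVG-to-CNC step boils down to sampling each path into points and emitting the corresponding toolpath. A toy sketch of that idea in Python (we’re assuming a servo pen lift driven via an M3 command, which is one common plotter convention; real CNC dialects vary):

```python
def polyline_to_gcode(points, feed_rate=1000):
    """Turn a list of (x, y) points (e.g. sampled from an SVG path)
    into pen-plotter G-code: pen up, travel to the start, pen down,
    trace the path, pen up again."""
    gcode = ["G21 ; millimetres", "G90 ; absolute positioning"]
    x0, y0 = points[0]
    gcode.append("M3 S0 ; pen up")
    gcode.append(f"G0 X{x0:.2f} Y{y0:.2f}")         # rapid move to start
    gcode.append("M3 S90 ; pen down")
    for x, y in points[1:]:
        gcode.append(f"G1 X{x:.2f} Y{y:.2f} F{feed_rate}")  # draw
    gcode.append("M3 S0 ; pen up")
    return "\n".join(gcode)
```

    Run once per path in the SVG, this produces a file the CNC controller can feed to the pen holder.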

    Haroon writes “remotely”

    Nick mostly stepped away from tech, and built a wooden arcade cabinet called Birdbox to house a monitor, joysticks, and a RetroPie. However, he had one tech addition: a Flappy Bird clone with a Canary theme and Haroon’s voice!

    The logo rounds it off beautifully

    Bradley revisited a topic we’ve looked at previously: how to automatically grab a fingerprint of a production server and produce a Canary configuration which mimics that server. Mimic Rebooted sets up a Canary to imitate a server already live in your environment, to save having to manually configure each detail.

    Generating a Canary configuration by scanning a production machine

    Shereen repaired and repurposed a toy crane to add a remote control function to the previous wired design. Using MicroPython running on two Microbits, she had one drive the crane motors, and the other serve as the remote control, with wireless comms between the two.

    Parts and the finished crane

    Lissa put together a Raspberry Pi gaming console, her first foray into a Hackweek project and one guaranteed to bring hours of fun.

    Retro-gaming is best gaming

    Matt also took a crack at a carpentry project by building an infinity table. He added a distinctly tech twist by using a bunch of individually addressable LEDs (as opposed to a single LED strip), then wrote a Python-based webserver to set the LED colours!
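    The webserver’s core job reduces to turning a requested colour into one value per addressable LED. A tiny sketch of that step (the HTTP plumbing and the actual strip-driving code, which depends on the LED hardware, are left out):

```python
def solid_frame(hex_colour, n_leds):
    """Expand a '#rrggbb' web colour into one (r, g, b) tuple per
    addressable LED. The resulting frame would be pushed out to the
    strip by hardware-specific driver code."""
    value = hex_colour.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return [(r, g, b)] * n_leds
```

    Because every LED is individually addressable, the same approach extends to gradients and animations by computing a different tuple per position.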

    This demo had the viewers clamouring for Shopify links

    Todor re-implemented the fundamental Canary functionality by imagining what a “home-use” Canary might look like, where the hardware platform is super lightweight (ESP32), and the bird talks directly to Firebase. He then wrote a mobile app for receiving the Canary alerts, to build a PoC for a new kind of Canary.

    Alerts direct from bird to phone

    Mike designed a 3-in-1 projector from phones and tablets. It literally had three separate projection lenses, which is some kind of record for projectors. He could wirelessly stream content to the three separate lenses.

    I see your one projector lens, and raise you three! In different directions!

    Benjamin solved a problem which had previously vexed him (and me): some models of Subaru don’t have a temperature gauge, but only display a warning light when the oil temperature is too high. So he built a device to plug in to his car’s OBD-II port, grab the temperature measurement, and stream the data via Bluetooth to an app on his phone.
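    Reading the oil temperature over OBD-II comes down to requesting mode 01 PID 0x5C (engine oil temperature) and applying the standard A − 40 formula to the reply. A sketch of the decoding step in Python (the Bluetooth plumbing and the adapter-specific request framing are elided):

```python
def decode_oil_temp(response):
    """Decode an OBD-II mode 01 PID 0x5C (engine oil temperature) reply.

    `response` is the payload bytes: 0x41 marks a mode 01 response,
    0x5C echoes the PID, and the data byte A encodes the temperature
    as A - 40 degrees Celsius (the standard PID formula).
    """
    if response[0] != 0x41 or response[1] != 0x5C:
        raise ValueError("not a PID 0x5C reply")
    return response[2] - 40
```

    The same pattern (request a PID, decode with its published formula) covers coolant temperature, RPM and the rest of the standard sensors.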

    Homemade temperature monitor

    Az leveraged the T2 chip on his Mac to develop a custom tool for cryptographically signing things with a single tap on the TouchID pad. He targeted two separate actions: the firmware images we produce, and the code we commit.

    Code signing and verification in commit logs

    Deena also tackled Salesforce, and set up a flow so that when new Customers are created in Salesforce, we’re alerted in Slack. This solves a particular problem we see: as new customers are signed up, parts of our org are simply unaware of them. This gives everyone a chance to see who the new logos in our customer stable are.

    New customers in Salesforce show up in Slack

    Yusuf learned the lesson from last year, and set his sights on a manageable problem this time around. However he finished sooner than expected so kept going on other projects 🙂 He built a custom Canary link shortener usable from Slack (expect to see this in Customer mails soon), a voice note app for Slack, and an in-browser video-to-GIF conversion tool leveraging ffmpeg and Wasm.

    CanaryLinker: create short URLs in Slack
    CanaryCaster: send voice notes in Slack
    CanaryGifyfier: Convert videos to GIFs directly in-browser

    Caleb added a new platform to the six we already support for Canary, by producing a Canary that runs on OpenStack. This is still early days, but if the interest is there we can consider adding OpenStack as a supported platform.

    Virtual Canary running on OpenStack

    Keagan built and published a Chrome extension called Re-chord to assist folks in their music practice. It tracks links for music pieces, and will recall them when you want to practise at a later date.

    Tracking your music practice links with Re-chord

    Riaan made a device for discreetly defacing public displays. A Pi-Zero is plugged into the HDMI port of any compatible display. It then polls a public DNS record, and when the trigger value is returned in the DNS response, the Pi-Zero switches the HDMI input to the Pi and plays a video, before switching back to the original display input. He tested it on his family, and suitably freaked everyone out!
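    The core loop is simple: resolve a DNS record on a timer, and fire once the trigger value shows up. A sketch in Python (the resolver and the video-playing/input-switching action are passed in as callables, since those parts are device-specific):

```python
import time

def watch_for_trigger(resolve, record, trigger_value, act,
                      poll_interval=30, max_polls=None):
    """Poll a DNS record until it returns the trigger value, then act.

    `resolve` is any callable mapping a record name to its current
    value (on the real device this wraps an actual DNS lookup);
    `act` plays the prank video and restores the original HDMI input.
    Returns True if the trigger fired, False if max_polls ran out.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        if resolve(record) == trigger_value:
            act()
            return True
        polls += 1
        time.sleep(poll_interval)
    return False
```

    Driving the prank through public DNS is what makes it discreet: the Pi makes only ordinary-looking lookups until the record is flipped.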

    Surprise Rick!

    Haroon built love.pl (a golang tool without go in its name), then turned his attention to needlework and produced pillows with the Canary logo on them as a pleasing backdrop for his Zoom calls. Keep an eye out for them next time you vidchat with him!

    So tasteful

    Lastly, I delved into Terraform, Packer, and Saltstack to automate a particular environment we’ve pondered for a little while.

    Wrapping up

    Hackweek was a great success, and as a yardstick for our growth it demonstrated some of the logistics we need to improve on. But that’s a key part of why we do it: growth in Thinkst is dependent on growth in Thinksters, and a learning org is what we are. Onwards to next year!

  • New features aren't Solved Problems

    One of the big disconnects in infosec lies between people who build infosec products and people who end up using them on the ground.

    On the one hand, this manifests as misplaced effort: features that are used once in a product-lifetime get tons of developer-effort, while tiny pieces of friction that chafe the user daily are ignored as insignificant. On the other, this leaves a swath of problems that are considered “solved” but really aren’t.

    The first problem is why using many security products feels like pulling teeth. This is partially explained by who does what on the development team. The natural division of labor amongst developers means that the super talented developers are working on the hairy-edge-case problems (which by definition are edge-cases) while less experienced developers are thrown at “mundane” / CRUD parts of the system. 
    But most of your users will spend most of their time on those “mundane” parts of the system. It’s those common paths that are most in need of talented re-thinking.
    The second problem is more insidious and is why we have a zillion security products that barely register as speed-bumps to determined attackers / penetration-testers: because shipping the feature is not the same thing as solving the problem.
    I recall reading that one of the most trying parts of swimming the English Channel is the final stage, where one can see the shore but it’s still a long way off. Building challenging features sometimes brings the same kind of pain: you find yourself on the wrong end of the Pareto Principle, with the last 20% requiring 80% of the work. When you add pressure and deadlines to this, it’s easy to see why many features will ship at this point. (Some smart process-optimizer might even get a raise for having maximized output per unit of input).
    The wrinkle though, is that the problem itself might not have been solved. (A big part of that is that real-world problems seldom seem to fall to idealised solutions. We can’t just “assume spherical cows”). 
    A few years back a major retail corporation invested millions of dollars in a popular brand of threat detection appliances. When the retailer was publicly exploited a few months later (by attackers who had roamed the network for months) it turned out that the threat detection devices (and the team behind them) had been raising alerts periodically, but that they were being ignored. The retailer probably replaced the CISO, and the product vendor went away making it clear that it wasn’t their fault.
    But wasn’t it?
    The argument that security product vendors often use is that they weren’t deployed properly, or that their alerts were ignored. Ignoring the fact that this is inexcusably user-hostile, it goes against the axiom made famous by Theodore Levitt: that “People don’t want to buy a quarter-inch drill. They want a quarter-inch hole!”
    Customers don’t buy security products because we generate alerts. They buy security products because they want us to stop badness (or catch bad actors). If we are generating ten thousand alerts, and customers can’t separate Alice from Carol, then to butcher analogies, they wanted a quarter-inch hole and we sold them 500 drill-bits and a power cord. (Some assembly required).
    Solving “alerting” isn’t easy. There’s a ton of academic work done on Alert-Fatigue and humans have spent a long time trying to figure out how to protect our flocks without crying wolf. Consider this example from our own Canary product:
    1. We see an attacker brute-forcing a service (let’s say an internal admin interface).
    2. We alert our customer with: “Attacker at IP: tried admin:secret to login to the NewYork-Wan”
    Over the next 5 minutes, the attacker runs through her userlist (or password list) trying 10,000 credential pairs. What do we do?

    There are a bunch of simple options, but in this case, most are pretty sub-optimal. 10k alerts are a pain, but throwing away the information is also silly (It makes a huge difference to me if the attacker is throwing random usernames and passwords at the site or if she is using real usernames and passwords from my Active Directory).

    On our Canary Console, we would generate a single event: “Admin thing being Brute Forced” and we would update the event continually. So you’d have one alert, and when you looked at it, you’d have the extra context you need.

    But how best to handle the SMS alerts? We can’t update the SMS with new information the way we can on other channels. We could hold back the alert for a bit, to gain more context, and send one summary alert, but that’s a dangerous trade-off. Do we delay screaming “fire” so we can be more accurate about the spread of the blaze?
    We could send n SMSes and then throttle them, but that’s pretty arbitrary too. There are tons of little niggly details to be solved here. (The same is true for sending alerts via Syslog or webhooks.)
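    To make the trade-offs concrete, here’s a toy sketch of the coalescing approach (not our actual implementation): notify immediately on the first attempt from a source, then fold later attempts into the same incident instead of generating fresh alerts. Nothing is held back, so there’s no delayed “fire”, and no information is thrown away:

```python
import time

class BruteForceCoalescer:
    """Collapse a flood of failed logins into one updating incident."""

    def __init__(self, notify, window=300, clock=time.monotonic):
        self.notify = notify    # callable that sends the single alert
        self.window = window    # seconds of quiet before a new incident
        self.clock = clock
        self.incidents = {}     # source IP -> incident state

    def record_attempt(self, source_ip, username, password):
        """Register one credential attempt; alert only on the first."""
        now = self.clock()
        incident = self.incidents.get(source_ip)
        if incident is None or now - incident["last_seen"] > self.window:
            incident = {"count": 0, "creds": [], "last_seen": now}
            self.incidents[source_ip] = incident
            self.notify(f"Brute force from {source_ip} "
                        f"started with {username}:{password}")
        incident["count"] += 1
        incident["creds"].append((username, password))   # kept for context
        incident["last_seen"] = now
        return incident["count"]
```

    The attempted credentials are retained on the incident, so when you do look at the one alert, you can still tell random guesses apart from your real Active Directory usernames.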
    A few years ago, Slack posted their flowchart for deciding when and where you get a notification if you’re tagged in a message.

    It’s clear that lots of thought has gone into this relatively straightforward feature. It’s also obvious that all messaging apps will claim to handle “Notifications”.

    You can see here where Pareto thinking back-fires. Generating an alert when you see an event is easy. You can totally ship the feature at that point. It demos well and if your sales team is making sales based on a list of features, they probably want you to move onto a new feature anyway. But this is how we end up here:

    It’s easy to get in a bunch of developers and hand out bonuses for features shipped. It takes dedication and commitment to solving the problem to keep hacking at the feature until it’s actually useful.
    All our natural behaviour pushes us towards new features. When I go onto a podcast, the host asks me “What’s new with Canary?” because that’s probably what his listeners want to hear, and a part of me really wants to tell him about all the new stuff we are thinking about, because it shows that we are charging forward. But there’s another, more important side to the work we do, and often what I want to talk about is something much less “exciting”, like “We are working really hard to optimize alerting”. It isn’t that catchy, but it might be where the effort is most needed.
    Maybe instead of constantly focusing on shiny new features, we all need to focus a little harder on making sure the ones we already built actually work.
  • Small things done well¹

    Bad design is bad

    In 2015 Moxie Marlinspike pointed out that the manual page for GPG is (now) 50% the length of the novel Fahrenheit 451. Any software whose man page approaches 20 thousand words had better have a good excuse, and GPG can only gesture vaguely at decades of questionable design.

    GPG gets a bad rap but it isn’t really much of an outlier. Security software has a long history of crummy, unintuitive interfaces and terrible design choices. A deep dive into the factors behind awfully designed security software isn’t the purpose of today’s blogpost, but suffice it to say there is seldom pressure from the end users. Security software mandated by a security team is often rammed down users’ throats, so it doesn’t bother being pleasant. It’ll sell anyway.

    We’ve worked hard to buck this trend from our first version. It’s one reason why we are one of the few pieces of security software that customers actually talk about in terms of love:


    Recently, we released a major update to our Console. It’s been simmering for ages now, and has some really subtle flavours in the details. We figured we’d highlight a few of our favorite bits (and the thinking around them).

    At the outset it’s worth knowing that we have always designed our Console to not keep our users hemmed in. While many security vendors try hard to be the “single pane of glass” commanding your daily attention, our goal is that you almost never need to login. Ideally, you’d set up your birds on day-1, and never go back until it mattered. 

    Learning through experience

    We rely on a third party accounting tool to help manage quotes, invoices, receipts, and the rest of the financial admin that a good-looking company such as ourselves might require. A while back, the accounting software vendor sent a mail announcing a major update that included significant interface changes. They announced this as “good news”, but we treated it like a bad smell:

    We were happy with our accounting package. We didn’t want “big changes” in it. We wanted to build awesome software and not think about our accounting package for 10 seconds more than we had to. A bit of reflection on this experience caused us to reevaluate our new Console design. Was this how our customers would feel too? We were about to change their interactions with us. Did they need this change?

    Everything is Different (not!)

    The new update was built around a handful of key features. Huge amongst them was that we moved to a modern front-end framework and introduced the concept of grouping birds. The path to the final design was long and twisting. We experimented with a number of ideas to group birds (flocks!) and tried prototypes of many of them. We just couldn’t get comfortable with most of them.

    In light of the accounting software experience, we went back to the drawing-board (several times) until we could make sure that a user who wasn’t signing up for any of the new functionality would effectively get a Console with almost no changes.

    Instead, they’d be greeted with a familiar layout and just a touch of added colour. This also helped shape our transition plan (for moving customers onto the new interface) and set the boundaries for the rest of the design work.

    Let’s talk specifics

    With the major design considerations out of the way, we want to introduce a bunch of the small touches that are impactful for users.

    Cut and paste

    We know that if someone is using the Console, they are probably going to be digging into incidents and passing around information. So we’ve made sure that all the fields you could need are easily copyable (and copy to a sane format). No need to highlight fiddly text and copy it just right. We take that problem away with handy Copy-actions next to each field.

    The “search anything” Search

    When we first built Canary, most of our users had 2-5 birds (and no Canarytokens). Now we have customers with over 1,000 Canaries and hundreds of thousands of Canarytokens. This means that our previous concept of listing Canaries (or tokens) falls away, which is why the new Console features a handy new global search box. Just start typing, and we will home in on the bird, flock, token, alert or artefact that you want to look at.

    It’s tucked away silently at the top of the screen, but once you get used to it, you will never look back.

    Defenders (should) think in Graphs

    Canary alerts are simple, and one alert is generally enough to let you know you’ve got to cancel your plans for the weekend, but what happens if you walk into 10 of them? Viewing the alerts as a graph allows you to spot at a glance if it’s 10 attackers attacking 1 server, or 1 attacker targeting 2 machines.

    We previously created “Graph View” to address this, but it was a few clicks away and was less polished. That’s now changed with our new and improved graph view.

    With cleaner elements and connections (and some simple animations) the graph view is now easier to use and understand than ever before. We think it’ll be the default view for many customers, giving them a quick glance into exactly what’s happening with their birds. 

    Context and Cards

    Although Flocks give our users finer control over their Console, they potentially add new levels of complexity to the UI. Previously, things were simple. You would click on a bird to see its settings, or click on an alert to view its details. There wasn’t any additional context to hold.

    This changes when you group birds into Flocks. From a UX perspective, the additional context raises thorny issues. For example, if the New York flock is currently active on screen and an alert arrives from a bird in the Mumbai flock, should it be displayed and, if so, where? How do we view settings that apply only to the Cape Town Flock? Keeping that context clear has the potential to add an unpleasant cognitive load on the user.

    If we simply used modals at each step, a user would very quickly get lost (and likely frustrated) in a deck of multiple open modals. We want to give our users these new features, but we don’t want to complicate the UX.

    In order to handle these context switches (from global views to drilling down into a specific Flock and beyond) we use a combination of cards and transitions. Cards here are, essentially, panels which take over the current visual context, with nothing visible behind them (à la modals). The cards allow us to encapsulate the current context (where the user is). The transitions allow us to flow into this new context logically, and allow the user to intuitively understand the change.

    Let’s look at clicking on a Flock:

    The clicked flock card transitions from being one item in a list to becoming the focal point. The user’s view narrows from the global context to the Johannesburg Office flock. The card encapsulates the new context, and the transition allows the user to mentally follow the switch. No other content is visible behind the card, so the user knows this is the only active context. Multiple cards cannot be open at the same time.

    A user clicking on the “Johannesburg Office” flock sees a transition: the flock moves from being one of the listed flocks to grabbing focus, then expands to capture all of the user’s attention. It then dominates their view and becomes their new context.

    It was important for us to get this right. We added new features and complexity, but we didn’t want the user to have to pay for it. We needed to put in the effort and take the complexity away from the user.

    Animations aren’t for everyone

    Natural animations play a big part in guiding the users’ understanding of their context in the Console. Clicking on a Flock expands its card to fill the screen, since it’s now the focus of your attention. Want to edit the Flock’s settings? Flip the card around. But it’s possible that once you’ve gotten used to them, you no longer need the animations to give you context. So we’ve taken the unusual step of letting you turn off animations with a toggle. (Yeah, we like them and think they make the app better, but we aren’t going to force them onto a user that sees no use for them.) 


    Along with the new look, we’ve introduced Inyoni. Inyoni, which means bird in Zulu (one of South Africa’s 11 official languages), may be seen hopping around the login form or periodically popping in.

    Inyoni isn’t gratuitous, and isn’t just a cute mascot; he fills a real need. There are times when you could be on a section of the site where you won’t notice an alert come in. Canaries throw so few alerts that it would be a shame to miss one! So if (and only if) you are scrolled away from your alert list at the top of the page, you will hear a pleasing pop and Inyoni will politely nudge in to let you know that there’s something for you to check out.

    Animated favicon

    Similar to the Inyoni pop, if a user is busy on another tab when a new alert comes in, we add a small animation to the favicon. It’s relatively inconspicuous, but adds a heads-up that might be useful. (This is a little trickier than it looks; not all browsers support animated GIFs in the favicon.)


    Not being satisfied with only creating Inyoni and the favicon animation, Max helped us spruce up our artwork for this release as well. Instead of stock icons, we now use custom artwork he’s painstakingly created. You’ll find these amazing sketches in the Console:

    Domain name slide

    With all of our practical features, we have to confess that this one was added purely because we love how it looks! When a customer logs in, they see their domain name in the top left corner. Instead of it scrolling off the page as you scroll down, it now gradually decreases in size until it tucks up neatly under our logo (which also gently gets nudged up to make space for the text). It’s a little thing, and will likely be missed by most users, but it makes a bunch of us happy every time we see it.

    Email Interactions

    We are strong believers that design isn’t just about how things look, but about how things actually work. We make sure that even our emails to customers try to make things easier. When we send an alert email, the email includes actions that the user can take immediately, right at the bottom of the mail. (The user can choose to acknowledge the incident or add the source IP to the ignore list, for example.) Most users might not even notice, but we do a bit of extra work on the back-end so that a user clicking one of these buttons doesn’t actually need to log in.
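    One common way to build login-free action links (we’re sketching the general technique here, not describing Canary’s actual implementation) is to embed an HMAC-signed token in each button’s URL; the server verifies the signature before acting, so no session is required. A minimal Python sketch, where the secret, the incident IDs, and the verbs are all hypothetical:

    ```python
    import base64
    import hashlib
    import hmac

    # Hypothetical server-side secret; it never appears in the email itself.
    SECRET = b"server-side-secret"

    def sign_action(incident_id, verb):
        """Build the token embedded in an email button URL, e.g. /action?t=<token>."""
        payload = f"{incident_id}:{verb}".encode()
        sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
        b64 = lambda b: base64.urlsafe_b64encode(b).decode().rstrip("=")
        return b64(payload) + "." + b64(sig)

    def verify_action(token):
        """Check the signature before acting; returns (incident_id, verb) or None.
        (A real system would also record a nonce to make each link single-use.)"""
        try:
            payload_b64, sig_b64 = token.split(".", 1)
            pad = lambda s: s + "=" * (-len(s) % 4)
            payload = base64.urlsafe_b64decode(pad(payload_b64))
            sig = base64.urlsafe_b64decode(pad(sig_b64))
        except ValueError:
            return None
        want = hmac.new(SECRET, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(want, sig):
            return None
        incident_id, _, verb = payload.decode().partition(":")
        return incident_id, verb

    token = sign_action("incident-42", "acknowledge")
    print(verify_action(token))         # valid link -> ('incident-42', 'acknowledge')
    print(verify_action(token + "x"))   # tampered link -> None
    ```

    The key property is that the button is self-authenticating: anything the server needs is in the signed payload, and a tampered or forged link simply fails verification.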

    It only saves users a few clicks, but it makes us incredibly happy knowing we’ve saved them.

    We do the same thing with our weekly Console round-up. Every mail includes an unsubscribe link that won’t try to guilt you into staying, and won’t ask you to explain why you no longer want to receive emails. Click the link; goodbye!


    Our view on integrations is that security vendors over-index on 3rd party integrations. “What products do you integrate with?” is a pretty standard sales question so it’s become the norm to tick as many of these boxes as possible.

    This often adds unneeded complexity to the product and just as often, does so without adding extra value. We pride ourselves on a simple product with high fidelity alerts, so we’ve largely avoided this.

    With the new Console though, we’ve included our first 3rd-party integration: Rumble, an ultra-light, quick network discovery tool. We’ve made sure that the integration is lightweight too, and customers who don’t use Rumble (or who turn the integration off) won’t ever notice it. Those who do will be able to quickly query Rumble for more information on IPs seen in Canary.

    Again, this integration “just works”. If you are logged into Rumble, we’ll automatically detect it, and will present you with a lookup link. If you aren’t, you’ll never be bothered by it at all.


    We work hard to delight our customers, whether it’s through support interactions, our birds, or, in this case, small UI changes that may well go unnoticed. We love them!

    Security tools don’t need to suck. Sucking less means that users are more likely to actually use them to their full potential, and in the case of security tools, that’s better for us all.


    ¹ With apologies to Rands

  • Something fresh

    This month we’re ready to release our first major Canary Console overhaul. We’ve obviously pushed updates to Canary and the Console weekly for almost 5 years, but this is the first time we’ve dramatically reworked the Console.

    Contrary to a bunch of other products, we don’t want to be your single pane of glass, and we work really hard to make sure that most customers never have to spend time in their Console at all. But our beefed-up Console offers you a bunch of fresh possibilities, and we figured we’d introduce some of them here.

    What’s different?

    The first thing that a new user should notice is that it doesn’t feel that different to the old Console. It has a new coat of paint, and some things look slicker, but it feels like just a slight visual upgrade on the original Console.

    This is completely by design, and belies a bunch of changes beneath the surface. It’s practically a trope that just as users become familiar with a product, the vendor drastically alters the user interface, forcing users to re-learn flows which were previously easy. We hate this. Tools are supposed to make our lives easier, not periodically give us pop-quizzes.

    We know that there’s a fine line between keeping the product familiar and introducing new features (or refreshing the look). Throughout this process we’ve tried to keep a clear view of this line. We’re really excited to show a few of the enhancements that are deployed to customers as of today.

    From the screenshot above, you can see a few of these right off the bat.


    The new search-box at the top means that you never have to hunt for things again. When we first built Canary, customer Consoles had 5–10 birds in them (and no Canarytokens). Today, we have Consoles with hundreds of birds and tens of thousands of active Canarytokens. The search-box lets you find things even if you aren’t really sure what you are looking for: it will surface Birds, Incidents, Canarytokens and more without the hunt, and you can search on pretty much any data tied to them.

    A better graph-view

    We’ve made a heap of improvements to our graph-view, which displays your alerts graphically rather than in table form. This is especially useful if you get a bunch of alerts; a quick click on the graph-view button will immediately clarify whether an attack is sourced from a single location or multiple locations in your network, and show you the birds involved.

    Birds of a Feather

    The biggest improvement in the new Console is the ability to group birds into flocks. We spent heaps of time making this simple and intuitive, so using it should feel pretty natural.

    Once you’ve created a flock, any birds or tokens you previously had enrolled will be sitting in your “Default Flock”. These can be moved over to new flocks if you choose.

    Of course the point (and joy) of having different flocks is that you can treat them differently, so all of your flocks can have different settings, different users, and even different alerting rules.

    User Management

    Although we’ve supported the ability to add and remove users from your Console for a while, you now have much finer-grained control over your users and their permissions. You can add users, restrict them to a single flock, allow them only to deal with alerts, or delegate managing the actual birds to other users. The permissions model is simple: you can watch flocks, or manage them.
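    That two-level, per-flock model is easy to picture as a small grant table. A hypothetical Python sketch (the user names, flock IDs, and data shape are all made up, not Canary’s actual schema):

    ```python
    from enum import Enum

    class Role(Enum):
        WATCHER = "watcher"   # may view a flock and deal with its alerts
        MANAGER = "manager"   # may also manage the flock's birds and settings

    # user -> {flock id -> Role}; hypothetical example data
    grants = {
        "alice": {"flock-nyc": Role.MANAGER},
        "bob":   {"flock-nyc": Role.WATCHER, "flock-cpt": Role.WATCHER},
    }

    def can_view(user, flock):
        """Watchers and managers can both see a flock's alerts."""
        return flock in grants.get(user, {})

    def can_manage(user, flock):
        """Only managers may change birds or settings."""
        return grants.get(user, {}).get(flock) is Role.MANAGER

    print(can_view("bob", "flock-cpt"), can_manage("bob", "flock-nyc"))  # True False
    ```

    Restricting a user to a single flock is then just a grants entry with one key; there is no global role to reason about.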


    Canarytokens also get a refresh in the new Console, bringing a bunch of utility that deserves its own post (next week). It’s especially useful to be able to place them in different groups. Shortly, we’ll be releasing updates to the Console API, alongside helper utilities to make it easier to deploy tokens by the dozen (or dozen dozen) inside your networks.

    Audit Trail

    If you’re an admin user, you now have access to the Audit Trail, which gives you detailed information on all activities performed on the Console. (You can also download a JSON dump of all activity if needed). The audit trail backend code has been in place for a while, so your audit trail is already populated with a bunch of your activity.

    Support and our Knowledge Base

    We try hard to make sure that Canary is easy to use, and where options need explanations, we usually build this into the app. However it’s still possible for users to lose their way. The Console now includes a constantly visible link to “help” that’s backed by a heavily populated Knowledge Base and a pretty decent search. You can still use the interface to send us a support request, and our helper elves will be super-quick to respond, but the KB should make things easier.

    Email Changes

    The new look also means that things like your weekly newsletter get a much-awaited visual upgrade, but some changes are also functional.

    Emails now include single-use buttons to Acknowledge or Delete alerts (or to add them to your ignore list) which don’t require you to log in. (This allows you to react quickly from your phone/mailbox. We really mean it when we say we don’t want to be your single pane of glass.)

    Easy copy & paste

    We’ve added a bunch of convenience functions to make sure that getting data out of the Console is quick and simple. Most data fields have a neatly hidden Copy button you can hit to grab data into your clipboard.


    We already support SSO and you’ve always been able to make use of Duo / Google Authenticator / Authy for MFA. The new Console adds the awesomeness that is WebAuthn to our authentication defense lineup.

    More to come

    There are a bunch of other features that we can’t wait to share with you, and in the coming days we’ll release more blogposts. For now, take it for a spin. It should be all Canary: simple, and easy to use. Drop us a note with your thoughts!
  • 3rd-party API-Key Leaks (and the Broker)


    Continually refining our security operations is part and parcel of what we do at Thinkst Canary to stay current with attacker behaviours. We’ve previously written about how we think about product security (where we referenced earlier pieces on custom nginx allow-listing, sandboxing, or our fleet-wide auditd monitoring).
    Recently we examined our exposure to API key leakage, and the results were unexpected.


    Like most companies, we use a handful of third-party providers for ancillary services. And, like most providers, they expose an API and give us an API key. A short while back, as part of an exercise examining our internal controls relating to third-party API keys, we asked:
    • has an attacker grabbed this key?
    • has she actually used this key?
    • what did she do with this key?
    It turns out that even really popular service providers, by default, provide very crummy answers to these questions.
    That’s quite a conclusion to reach.
    To be clear, most providers expose their service logs (i.e. what did they do for you), but few expose API logs (what Thinkst’s API key did or attempted on the service). Consider a third-party transactional mailer service which sends emails on your behalf. An API key lets you send emails, but it also lets you query the API to recover previously sent emails (within a time window). With one provider there’s simply no way for us to determine whether our API key has ever been used to retrieve old emails; API logs aren’t available.
    The tl;dr is that if we expect to have answers to these questions, we have to take care of creating those logs ourselves. But how?


    In our current model, we have hundreds of customer Consoles that hold an API key for mail and SMS providers. Both our mail and SMS providers restrict us to a single key, so we end up with all those consoles using a shared key. 
    This isn’t a train smash, but what if an attacker compromised a single customer console?
    With enough privileges, they’d be able to grab those keys and start to query the provider APIs. (It would be great if we could have separate keys, or if we could ask useful questions of the providers.) We can’t, so we built the “Broker”.


    Conceptually it’s pretty straightforward. The Broker is a proxy which sits between our Consoles and third-party providers. The actual API keys are stored on the Broker, and not the Consoles. So a breached Console cannot reveal valid API keys (because they’re absent). Instead, we generate replacement keys, unique to each Console, as part of our configuration management process.
    As a single Go binary, the Broker has a pretty small attack surface, and logs all of its actions to our ELK stack.
    This position on the hot path between API consumer and provider lets us add a bunch of coolness:
    • We create individual keys for each of our Consoles, even if the provider only gives us one. That means a breach on any Console doesn’t yield access to the third-party API key, and activity can be traced back to the individual Console;
    • We log all interactions. This gives us our audit trail which the API providers are unable or unwilling to expose;
    • We can cycle keys on a penny (and have added this into our configuration process);
    • (Down the line we can even do things like block certain API calls to certain keys).
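    To make those mechanics concrete, here is a toy Python model of the idea (the real Broker is a Go binary, and every name below is hypothetical): the broker holds the single real provider key, mints a substitute key per Console, swaps keys on the way through, and records every call.

    ```python
    import secrets
    import time

    class Broker:
        """Toy model: per-Console substitute keys map to the one real
        provider key, which never leaves the broker."""

        def __init__(self, real_provider_key):
            self._real_key = real_provider_key
            self._console_keys = {}   # substitute key -> console id
            self.audit_log = []       # stand-in for shipping logs to ELK

        def issue_key(self, console_id):
            """Mint a unique substitute key for one Console (at config time)."""
            key = secrets.token_urlsafe(32)
            self._console_keys[key] = console_id
            return key

        def revoke_console(self, console_id):
            """'Cycle keys on a penny': drop every key tied to a Console."""
            self._console_keys = {k: c for k, c in self._console_keys.items()
                                  if c != console_id}

        def forward(self, substitute_key, api_call):
            """Swap the substitute key for the real one; log every attempt."""
            console = self._console_keys.get(substitute_key)
            self.audit_log.append({"ts": time.time(), "console": console,
                                   "call": api_call,
                                   "allowed": console is not None})
            if console is None:
                return None   # unknown or revoked key: the real key is never exposed
            return {"Authorization": f"Bearer {self._real_key}", "call": api_call}

    broker = Broker("real-provider-key")
    key = broker.issue_key("console-nyc")
    print(broker.forward(key, "POST /v1/mail/send"))   # rewritten with the real key
    broker.revoke_console("console-nyc")
    print(broker.forward(key, "POST /v1/mail/send"))   # revoked -> None
    ```

    A breached Console only ever holds its substitute key, every call is attributable to one Console via the log, and rotating a Console out is a dictionary edit rather than a provider-side key cycle.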
    This is conceptually similar to Diogo Mónica and Nate McCauley’s crypto-anchors (a 2014 talk worth watching).


    An API broker like this isn’t for everyone. We now have an extra service to maintain and there’s a single service holding multiple API keys (but we are pretty happy with the trade-off).
    In terms of scaling, while a proxy is a chokepoint, Canaries don’t generate loads of alerts and we haven’t seen bottlenecks (should that situation arise, load balanced Brokers are possible but not worth the added complexity right now).
    We considered using one of the big API gateways, but decided against it. While they seem to focus on APIs going in the other direction, they can be twisted to suit our needs; we just didn’t want a huge, relatively untrusted code-base fulfilling this role for us.


    We’ve been running the Broker internally for a while, and every day thousands of emails and SMSs are sent through the little broker that could. We’ll add a little more documentation and then throw it up on our GitHub account for others to use. If you are interested in playing with it, drop us a note @ThinkstCanary or at research@thinkst.com.
  • A Steve Jobs masterclass (from a decade ago)

    A decade ago, Steve Jobs sat down at the D8 conference for an interview with Kara Swisher and Walt Mossberg. What followed was a masterclass in both company and product management. The whole interview is worth watching, but I thought there were a few segments that stood out.


    Any time someone talks about a tech-titan, there’s reflexive blowback from parts of the tech community: “He wasn’t really an engineer”, “He wasn’t really…” – This post will ignore all of that. Even if you strongly dislike him, there are lessons to be learnt here.

    Let’s begin…

    What matters most:

    The interview starts with Kara and Walt congratulating Jobs, because Apple had just bypassed Microsoft in Market Capitalization. Right out of the gate, Jobs makes it clear:

    It’s surreal to anyone who knows the history, but:

    Jobs: It doesn’t matter very much… it’s not what’s important.. it’s not why any of our customers buy our products.. It’s good for us to keep that in mind, remember what we’re doing and why we doing it.

    Even if he is just saying it for the crowds (and I don’t believe he is), he’s making it clear what they are about. He’s making it clear what will be celebrated in the company. It’s not about their market-cap, it’s about making things that their customers love.

    On the importance of focus:

    Literally the first question (after discussing market-cap) asked Jobs about their decision to drop Flash, and their “war with Adobe”. His response on Flash alone makes this whole interview worth it. He starts with something subtle:

    Jobs: Apple is a company that doesn’t have the most resources of everybody in the world, and the way we’ve succeeded is by choosing what horses to ride really carefully

    This is an amazing answer, especially when it follows immediately from the discussion of their market-cap. They were not yet the Trillion Dollar Apple, but they were no slouches. The combination in 2010 of an ascendant Steve Jobs, their market-cap, and the halo around Macs and iPhones was a virtually irresistible magnet for just about any technical staff they wanted. They could have doubled their workforce in an instant and aimed to do everything, but you see a deep, deep understanding from Jobs that just adding people doesn’t mean you can do more great things.

    Great products need great people and great focus. Every product manager in the world will voice approval for the Dieter Rams aesthetic and will talk about the importance of saying “no”. Very few people commit to it (and fewer still stand by their guns when people start calling them crazy for it).

    It would have been trivial for Apple at that point to hire a team or two to shoehorn a so-so version of Flash onto their devices. (This seemed like an even bigger mistake on the iPad which had just launched). But he is clear and adamant. They are going to make calls they believe are best to shape great products.

    On “Courage” (and choosing):

    Mossberg pushes with a question many of us had at the time:

    Mossberg: What if people say the iPad is crippled in this respect?

    Jobs: Things are packages of emphasis…

    Some things are emphasized in a product, some things are not done as well in a product, somethings are chosen not to be done at all in a product, and so different people make different choices and if the market tells us we making the wrong choices we listen to the market, we’re just people running this company, we trying to make great products for people, and we have at least the courage of our convictions to say we don’t think this is part of what makes a great product, we going to leave it out

    We’re going to take the heat, because we want to make the best product in the world for customers.

    It’s become something of a running joke when Apple-today talks about “courage”, but it really does take courage to swim against the tide this way. It takes courage to make a choice instead of simply adding a toggle and making the customer choose. It takes courage to look at a feature that lots of people “want” and make the call to exclude it. (This seems obvious and easy to sell, but suffers from the same perverse-incentive problem we see in infosec: excluding something gets you immediate, loud feedback but you seldom, if ever, get feedback that you did this right. It’s why people default to throwing in everything, and it’s why we end up with generic, unremarkable software.)

    On putting in the hours:

    There was an interesting discussion on the relatively new phenomenon of Jobs mailing people. They were discussing a specific event (Steve Jobs to Valleywag at 2:20 AM: “Why are you so bitter?” | VentureBeat), but I think, again, it was revealing and insightful.

    There’s been a huge push in recent times for balance, wellness and rest. People are quick to point out how burning the midnight oil translates to diminished capacity and actually increases your rate of errors, but I’ve never seen consistently great work from people unless they too, were working deep into the night on projects they believe in. 

    It’s super telling that this 55-year-old billionaire would be “working on a presentation he was giving” at 02h00. It is probably possible to do great things without burning the midnight oil; I’ve just never seen it personally, or been able to do it that way myself.

    Why would a CEO need to work that hard anyway?

    Dug Song often laments the lack of CEOs who can demo their own products. Any random 10 minutes of the interview should convince you that Jobs is smart, but it’s worth paying attention to the depth of his understanding on almost every topic raised.

    When the topic of Foxconn comes up, you see him rattle off details on the foreign factory that you wouldn’t expect him to have at his fingertips: suicide rates per capita, details of their investigations. He then effortlessly swaps to phone market-share splits and technical details of Adobe products. Swap again to mobile ads, and he’s talking about Flurry Analytics by name and describing his hatred of mobile ads and what’s wrong with them. Swap again and he’s talking browser share and how open-sourcing WebKit might make inroads against IE (WebKit, powering Chrome, went on to overtake IE in 2012).

    I’ve no doubt he had teams of smart people driving teams of smart people, but it’s also clear that he was deeply involved with every aspect that ends up touching a customer. It demands obsession, and it demands time, and I’d hypothesise that one of the main reasons we are able to talk about his depth of involvement, is because we’ve already discussed him “putting in the hours”.

    On taking yourself too seriously:

    It’s clear that Jobs is focused (and a little tightly wound). Stories abound of terrifying elevator rides with him, but at the quarter-mark of the interview, he gives us a tasty morsel.  Mossberg begins by saying that Jobs spent a good chunk of his career fighting a platform war with Microsoft. Jobs actually doesn’t concede this.

    In a moment of self-deprecation/levity he quips “We never saw ourselves as in a platform war with Microsoft and maybe that’s why we lost”. It’s a quick flash of him taking himself less-seriously and it’s disarming.

    Of course he immediately gets back onto his core message:

    Jobs: We always thought about how we can build a much better product than them, and I think that’s still how we think about it

    Nobody can listen to this and have any doubt about what their North Star is.

    On the Consumer market vs The Enterprise market:

    In 2008 (two years earlier), Google had launched its Android phone and started competing with Apple on mobile. Swisher asked if Apple planned to remove Google apps from the iPhone.

    Jobs’ answer is predictably on-message:

    Jobs: No… we want to build a better product than they do, and we do…

    That’s what we are about

    He goes further though, and opens up an interesting topic: the difference between making/selling products in the consumer and enterprise spaces:

    Jobs: We are about making better products, and what I love about the consumer market that I always hated about the enterprise market, is that: we come up with a product, we try to tell everybody about it and every person votes for themselves. They go yes or no, and if enough of them say yes, we get to come to work tomorrow. You know that’s how it works. It’s really simple.

    Whereas with the enterprise market, it’s not so simple, the people that use the products don’t decide for themselves, the people that make those decisions sometimes are confused.

    We love just trying to make the best products in the world for people and having them tell us, by how they vote with their wallets, whether we’re on track or not.

    This was always a horrible trend and it’s why organizations have suffered for so long with horrible enterprise software.

    Ten years later, we know this is changing, and it’s devices like the iPhone and iPad proliferating into the enterprise that spurred this change. At Thinkst Canary, we are huge beneficiaries of it. We build a cyber security product that’s used by some of the biggest names on the internet, but our look and feel is as “consumer” as it gets.

    When a16z’s Martin Casado spoke last year about the latest trends in the good guys’ favour in cyber security, he touched on our Canaries, and pointed out that while we are used by some of the best names in the valley, we are simple enough for him to install at home.

    Casado: Every sophisticated logo that you know use these. They have over 700 customers and yet they are simple enough for you and I to buy over the web and install. So it just goes to show you to what level sophisticated security is being consumerized (and you should know about it)

    We think Jobs called it absolutely right in 2010, and are pretty convinced that those ugly enterprise-walls are crumbling… 

    On putting things back on the shelf:

    One of the highlights of the interview (for me) was Jobs’ discussion of what came first: the iPhone or the iPad? Until this point, it was always assumed that Apple built the iPhone, saw success, and then said: “let’s make another one, but bigger”.

    The truth holds more lessons for us:

    Mossberg: Did you consider doing a tablet when you did the iPhone?

    Jobs: I’ll actually tell you kind of a secret, I actually started on the tablet first.

    I had this idea of getting rid of the keyboard and typing on a multitouch glass display… and I asked our folks could we come up with a multitouch display that we could type on, that I could rest my hands on, and about 6 months later they called me in and showed me this prototype display and it was amazing and… I gave it to one of our brilliant UI folks and he called me back a few weeks later and he had inertial scrolling working and a few other things. Now we were thinking of building a phone at that time, and when I saw that rubber-band inertial scrolling and a few of the other things, I said “my god… we can build a phone out of this”. I put the tablet project on the shelf because the phone was more important, and we took the next several years and did the iPhone. And when we got our wind back, and thought we could take on something next, we took the tablet off the shelf, took everything we learned from the phone, and went back to work on the tablet.


    The first thing worth noting is that they didn’t rush out a tablet as soon as they could. They knew there was a need, but they were doing the work to figure out if they could fill that need. 

    Microsoft fans will often point out how they had a Windows tablet before the iPad, and a Windows smartphone before the iPhone, but they will have to admit that those devices took the PC and attempted to shrink it down to smaller form factors. So you could totally run MS Excel on your Windows phone, if you could use a stylus to give you the pin-point accuracy of your mouse pointer…

    This is not an argument against launching early and iterating, but it does remind us that, like all aphorisms, “launch early and iterate” misses some nuance. Apple wants a tablet, they see its promise, but they believe there are a few problems that need solving before their MVP is V enough.

    We also see Apple sticking to their guns. It’s easy to say “we say ‘no’ to things” when you are striking off a bunch of mediocre ideas. Saying no to the tablet you really want to build (to focus on the phone you think you can build now) is where you see this choice play out for real. (That I’m writing this on an iPad, while many of you are reading it on an iPhone, is the next-order effect of making those choices.)

    On gratitude:

    We have a pretty great record as a young software company. Our NPS scores are consistently north of 70 and we have a page on our website dedicated to unsolicited feedback from customers who talk about loving Canary. A fair bit of that feedback notes that people have loved interacting with other parts of our business: Support; Admin; even billing! 

    We used to find it odd, that people found it odd! 

    To us, it seems like the most natural way for this to be. We know that there’s a zillion products in the marketplace, but someone chose us. We know that on an Internet filled with people saying stuff, our customers have listened to our voice. Every single person in the company is aware of this, and is deeply grateful to our customers for it.

    The customer has choices, and the customer chose us. Seen through this lens, gratitude seems like the obvious response but it disappears when customers are seen as just a prop to help you “make the quarter”.

    Jobs makes it clear that this doesn’t have to fade as your market-cap grows:

    Jobs: I have one of the best jobs in the world. I’m incredibly lucky. I thank all of our customers and employees for letting me do what I do. I get to come in and work every morning, and I get to hang around some of the most wonderful, brightest, committed people I’ve ever met in my life, and together we get to play in the best sandbox I’ve ever seen and try to build great products for people.

    Market-share, and profit totally matter because they let us go on doing what we love. They are a tangible feedback of customers liking what we are building and can be a proxy for customer happiness, but it’s super important for us to make sure we don’t fall foul of Goodhart’s law, making the measure our actual target instead.

    On Humility and Hubris:

    Jobs was often pilloried for telling people they were wrong. Just months earlier Apple went through Antenna-Gate where it was discovered that their new iPhone dropped calls when gripped a certain way. He famously quipped that people didn’t know what they wanted till you showed it to them. But several times during the interview he makes it clear that this doesn’t translate to ignoring your users forever: You do your homework, you make your choices, and then you pay attention to see how your users respond.

    For a person with so many phenomenal products under his belt, he also doesn’t assume he’s infallible. A few times during the talk he makes it clear: “we are just humans trying our best to do what we think people will love”. Someone uses the Q&A session to remind him that sharing content between devices was still painful. He’s quick to point out:

    Jobs: we need to work harder on that. We need to do better. We’re working on it.

    Product design (and running a company) bring Jim Collins’ “the Genius of the AND” to the fore as much as anything. You need the courage to make choices, AND you need to be able to listen to customers and use their feedback. It’s just the nature of the beast.

    On pragmatism and “seeing the whole product”:

    In one of his great Twitter threads on the difference between Good Product Managers and Great Product Managers, Shreyas Doshi (@shreyas) says:

    Good PMs are detail-obsessed, making sure that the product meets the desired quality bar for launch. Great PMs pay this degree of attention to the entire customer experience: they know that the docs, the API, blog post, website, support emails, etc. are also “the product”.— Shreyas Doshi (@shreyas) April 11, 2020

    A clear Apple/Jobs fan implores him to save TV. “You did it with the iPhone, now save TV”. Jobs gives an answer that encapsulates so many of the thoughts discussed earlier.

    Jobs: The problem with the television market, the problem with innovation in the television industry is the go-to-market strategy. The TV industry fundamentally has a subsidized business model that gives everybody a set top box for free, or for $10 a month and that pretty much squashes any opportunity for innovation because nobody’s willing to buy a set top box. Ask Tivo, ReplayTV, Roku, ask Vudu, ask Google in a few months.

    Sony’s tried as well, Panasonic’s tried, lots of people have tried, they all failed.

    So all you can do, … is go back to square 1, redesign the set top box, with a consistent UI, across all these different functions, and get that to the consumer in a way that they are willing to pay for it.

    And right now there’s no way to do that. So that’s the problem with the TV market. 

    We decided what product do we want the most, better TV or a better phone? Well the phone won out, but there was no chance to do a better TV because there was no way to get it to market.

    What do we want more, a tablet or a better TV? Well probably a tablet, but it doesn’t matter because if we wanted a better TV, there’s no way to get it to market.

    The TV is going to lose until there’s a better, until there is a viable go-to-market strategy. Otherwise you’re just making another TiVo. That’s the fundamental problem. It’s not a problem of technology, it’s not a problem of vision, it’s a fundamental go-to-market problem.

    Question: In the phone area you were able to recreate that go-to-market strategy by working with the carrier, so does it make sense to partner with the cable operator?

    Jobs: Well then you run into another problem: there isn’t a cable operator that’s national. There’s a bunch of cable operators, and then there’s not a GSM standard where you build a phone in the US and it works in all these other countries. No, every single country has different standards and different government approvals. It’s very […] Balkanized. But when we say AppleTV is a hobby, that’s why we use that phrase.

    The marketplace is filled with products that were built because they technically could be. Notice that he isn’t just talking about building a good product; he’s considering how they will stock it, how they will distribute it. They know it’s an area that has some promise (which is why they maintain their AppleTV hobby), but he has no delusions that because they are Apple they can make a channel where one does not yet exist. (You see this in sharp relief when he shares a stage with Bill Gates and is asked questions on the future of computing. While Gates happily talks of 3D VR interfaces, Jobs is more grounded, considering what people are used to and how they will operate it.)

    In “The Innovation Stack”, Square’s co-founder Jim McKelvey talks about how he struggled to find mentors till he realised that he didn’t need to confine himself to people he could call on the phone.

    Jobs, who produced a string of hits across a raft of products and generated such maniacal love that his top customers almost define the term “fan-boys”, is exactly that kind of mentor-at-a-distance. If we want to build great products, it would be smart to take a closer look at how he did it.

    Ben Horowitz often starts his posts with a choice phrase from a contemporary hip-hop song. I’m going to go the other way, ending with a line from Kenny Rogers’ “The Gambler”.

    “And somewhere in the darkness

    The gambler he broke even

    But in his final words

    I found an ace that I could keep”

  • Good UNIX tools

    aka: Small things done well 
    We spend a lot of time sweating the details when we build Canary. From our user flows to our dialogues, we try hard to make sure that there are very few opportunities for users to be stuck or confused.
    We also never add features just because they sound cool.
    Do you “explode malware”? No. 
    Export to STIX? No. 
    Darknet AI IOCs? No. No. No. 
    Vendors add rafts of “check-list-development” features as a crutch. They hope that one more integration (or one more buzz-word) can help make the sale. This is why enterprise software looks like it does, and why it’s probably the most insecure software on your network.
    This also leads to a complete lack of focus. To quote industry curmudgeon (and all around smartypants) Kelly Shortridge: “it is better to whole-ass one thing than to half-ass many”. We feel this deeply.
    Most of us cut our teeth on UNIX and UNIX clones and cling pretty fastidiously to the original Unix philosophies¹:
    • Make each program do one thing well
    • Expect the output of every program to become the input to another
    This is pretty unusual for modern security software. Everybody wants to be your “single pane of glass”. Everybody wants to be a platform.
    We don’t. 
    Tired: Vendors trying to be an island.
    Wired: Vendors who work well together.
    Inspired: Let’s get ready to Rumble…
    Rumble, HD Moore’s take on network discovery, shares a similar perspective, and provides effortless network inventory visibility without credentials, tap ports, or heavy appliances. Rumble tries to provide the best inventory possible through a single agent and a light (but smart) network scan that is safe to run in nearly any environment. (If you are a fan of the quick deployment and light touch of Canaries, you should check out Rumble’s similar approach to network asset inventory!)
    It’s fast, it has a free tier, and now it integrates with your Canary Console too.
    To illustrate this integration, assume someone reaches out to a (fake) Windows Server called \\BackupFS1 and copies \\BackupFS1\Salaries\2020\Exco-Salaries.xlsx. Your Canary will send a single, high fidelity alert to let you know that skullduggery is afoot: we can tell you which host accessed the Canary, and that AcmeCorp/Bob (or his creds) accessed the share. 
    We give you some details on the attacker, but what if you were also running a Rumble inventory of this network? Well then we can simply hand you over to Rumble for the rest of the story.
    From June, Canary customers who are also running Rumble, will notice a new integration option under their Flocks Settings. 
    Rumble Integration Settings
    Once this is turned on, IP Addresses in alerts will include a quick link that allows you to investigate the address inside of Rumble.
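    To make the idea concrete, here’s a hedged sketch in Python of what such a deep-link could look like. The console hostname and query syntax below are guesses for illustration only – the real link is built for you by the Canary Console:

    ```python
    from urllib.parse import quote

    def rumble_investigate_link(ip: str) -> str:
        """Build a link that searches a Rumble inventory for one IP address.

        Both the console URL and the 'address:' query syntax are assumptions
        made for this sketch, not Rumble's documented interface.
        """
        return "https://console.rumble.run/inventory?search=" + quote(f"address:{ip}")
    ```

    An alert containing 10.0.0.5 would then carry a link straight into the inventory search for that address.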
    The integration is light and non-obtrusive, but should immediately add value. It also affords us room for a slight flourish. It’s possible that you could use both Canary and Rumble, and never visit the settings page to enable the feature. We have users with hundreds of birds who only visit their Console once or twice a year (when there’s an actual alert). It’s ok. We got you!
    The Canary Console will automatically detect if you have a valid Rumble login, and if you do, will enable the integration to show you the link². You won’t have to think about it, it will “just work”.

    ¹ https://archive.org/details/bstj57-6-1899/page/n3/mode/2up

    ² If you hate this, you can stop it from happening by setting the integration to “never” in your settings.

  • Canarytokens.org – Quick, Free, Detection for the Masses


    This is part 2 in a series of posts on our 2015 BlackHat talk, and covers our Canarytokens work.

    You’ll be familiar with web bugs, the transparent images which track when someone opens an email. They work by embedding a unique URL in a page’s image tag, and monitoring incoming GET requests.

    Imagine doing that, but for file reads, database queries, process executions, patterns in log files, Bitcoin transactions or even Linkedin Profile views. Canarytokens does all this and more, letting you implant traps in your production systems rather than setting up separate honeypots.

    Canarytokens is available for free at http://canarytokens.org, or you can download and run your own installation (source and Docker images are available.)

    Why should you care?

    Network breaches happen. From mega-corps to governments, from unsuspecting grandmas to well-known security pros. This is (kinda) excusable. What isn’t excusable is only finding out about it months or years later.

    Canary tokens are a free, quick, painless way to help defenders discover they’ve been breached (by having attackers announce themselves.)

    How tokens work (in 3 short steps):

    1. Visit the site and get a free token (which could look like a URL or a hostname, depending on your selection). 
    2. If an attacker ever uses the token somehow, we will give you an out-of-band (email or SMS) notification that it’s been visited.
    3. As an added bonus, we give you a bunch of hints and tools that increase the likelihood of an attacker tripping on a Canarytoken.

    More Details:

    Tokens consist of a unique identifier (which can be embedded in either HTTP URLs or in hostnames.) Whenever that URL is requested, or the hostname is resolved, we send a notification email to the address tied to the token. You can get one in seconds, using just your browser.
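    Under the hood, the mechanics are simple enough to sketch in a few lines of Python. The token id, URL layout and hostname format below are invented for illustration – real tokens come from the server:

    ```python
    import socket

    def token_url(token_id: str) -> str:
        # One way a token can be dressed up as a link (layout is illustrative):
        return f"http://canarytokens.com/{token_id}/index.html"

    def token_host(token_id: str) -> str:
        # ...or packaged as a hostname, for the DNS channel:
        return f"{token_id}.canarytokens.com"

    def trigger_dns(token_id: str) -> None:
        """Merely *resolving* the hostname fires the alert -- no TCP needed."""
        try:
            socket.gethostbyname(token_host(token_id))
        except socket.gaierror:
            pass  # even a failed lookup has already tripped the token
    ```

    The DNS variant is what makes so many of the tricks below possible: anything that can be coaxed into resolving a name can be turned into an alert.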
    To obtain a token:

    1. Visit http://canarytokens.org.
    2. Enter your email address. (It’s only used to notify you when the token is triggered, mails are not used for any other purpose.)
    3. Enter a comment which describes where you’re using the token. If the token is triggered in six months’ time, a comment will help you remember where you placed it. Be specific (e.g. “file watch on” or “Password lure email in user@domain.com inbox”). We envisage having loads of tokens, so a good description is necessary. 
    4. Click “Generate Token” to obtain your token. 
    5. Copy the token and drop it somewhere it will be stumbled over.

    How do attackers trip over a token?

    Recall that a typical token is a unique URL and/or hostname, and the URL component is pretty flexible, so a token can be dressed up to look like almost any interesting link.

    For example, you could send yourself an email with a link to the token plus some lure text.
    Simply keep it in your inbox, unread, since you know not to touch it. An attacker who has grabbed your mail-spool doesn’t know that. So if your emails are stolen, an attacker reading them should be attracted to the mail and visit the link – and while your week is about to get worse, at least you know. 
    If you like, you could even use the same token as an embedded image. This way it works like the classic 1×1 transparent GIF. Now an attacker reading your inbox could trip over it just because his mail client renders remote images. (In this way you can use free Canarytokens as a classic web/mail-bug, to receive a notification when an email you send has been read.)
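    Building such a lure is trivial. Here’s a sketch using Python’s stdlib (the addresses, subject line and token URL are all invented for illustration) that embeds the token both as a clickable link and as a 1×1 tracking image:

    ```python
    from email.message import EmailMessage

    # Invented token URL -- generate a real one at canarytokens.org.
    TOKEN_URL = "http://canarytokens.com/hypothetical123/index.html"

    def build_lure(sender: str, recipient: str) -> EmailMessage:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        msg["Subject"] = "FW: New payroll portal credentials"  # juicy lure text
        msg.set_content(f"The new payroll portal is live: {TOKEN_URL}")
        # The HTML alternative doubles as a classic web-bug via a remote image:
        msg.add_alternative(
            f'<p>The new payroll portal is live: <a href="{TOKEN_URL}">log in</a></p>'
            f'<img src="{TOKEN_URL}" width="1" height="1">',
            subtype="html",
        )
        return msg
    ```

    Drop the result in your own inbox and forget about it; the token does the remembering.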

    Production Usage

    Canarytokens can be used as simple web-bugs, but they are incredibly flexible as we’ll see.

    You may have a fancy SIEM that lets you know when stuff happens, but you’ll find that with a little creativity, there’s a bunch of places that you could get wins from a token (that can be deployed in seconds) that you couldn’t easily get to otherwise.

    Do you trust the admins/support at DropBox to leave your files alone? (or Office365? or HipChat?)

    Simply generate a token and drop it in your folder, or mention it in your HipChat channel. If some admin is browsing contents in their spare time (or is being coerced to do so by a 3rd party) they will trip over your URL and you’ll be notified.

    Tokens + helper tools


    Every time someone gets owned and their homedir gets published, there’s a bit of speculation on “how they got taken.” While we may not always know the answer to that question, there is something we _do_ know: files in their home directory were read. (This will include files that were never likely to be read by anyone, so this could be a really high quality marker that bad stuff has happened!)

    We include a no-dependency C program (Canaryfy) that will compile and run on Linux. Generate a token, then use it to watch a file; if the watched file is ever read, you will get your notification.
    On OS X, which lacks inotify events, we make use of DTrace to get the same result.
    You could use DTrace to monitor binaries executing too, so XXX will take a token as input, and will notify you if someone runs uname, id, ifconfig or hostname on your machine.
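    The core of the idea fits in a few lines. This Python sketch is not the real Canaryfy (which uses inotify): it simply polls a file’s access time and resolves a token hostname when anyone reads the file, and it assumes atime updates are enabled on the mount (i.e. not noatime):

    ```python
    import os
    import socket
    import time

    def file_was_read(path: str, last_atime: float) -> bool:
        """True if the file's access time has moved since we last looked."""
        return os.stat(path).st_atime != last_atime

    def watch(path: str, token_host: str, interval: float = 1.0) -> None:
        """Poll the file and fire a DNS Canarytoken when it is read.

        token_host is a hypothetical token hostname, e.g.
        'hypothetical123.canarytokens.com'.
        """
        last_atime = os.stat(path).st_atime
        while True:
            if file_was_read(path, last_atime):
                last_atime = os.stat(path).st_atime
                try:
                    socket.gethostbyname(token_host)  # the lookup IS the alert
                except socket.gaierror:
                    pass  # even a failed resolution trips the token
            time.sleep(interval)
    ```

    Polling atime is the crude version of what the real tools do with kernel notification APIs, but it illustrates the shape of the tripwire.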

    desktop.ini share + zip-files

    Windows provides an even cooler way to get notified, in the guise of the venerable old desktop.ini configuration file. Dropping a desktop.ini file in a folder allows Explorer to set a custom icon for the folder. Since this icon can reside on a remote server (via a UNC path), we can effectively use a DNS token as our icon file.
    This means anytime someone browses the directory in Explorer, a notification is sent! It’s an actual file tripwire without any agents or log file monitoring.
    (WinZip and WinRAR both maintain directory structures and honour desktop.ini – you can download a Zip file with the desktop.ini already packaged after you generate your token, and you’ll get notified if someone opens (expands) the Zip file.)
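    The generated desktop.ini is tiny; it looks something like the fragment below (the token hostname here is a made-up example). Pointing IconResource at a UNC path under the token’s hostname is what forces the DNS lookup the moment Explorer renders the folder:

    ```ini
    [.ShellClassInfo]
    IconResource=\\hypothetical123.canarytokens.com\resource.dll
    ```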


    Inserting Canary rows into a database, and then watching if they are ever accessed, is a pretty common piece of advice when reading about database security. Interestingly, we will wager that most people who have given this advice have never actually tried making it happen. It’s surprisingly painful, and likely not possible in the version of the database you’re running!
    It isn’t natively possible to have MSSQL server trigger an action on a SELECT statement, but what one can do is create a custom VIEW which triggers a DNS query when a SELECT is run against the VIEW.
    (It’s also possible to set permissions on the VIEW so anyone can run a SELECT on it without seeing its source.)

    Then, if anyone queries, say, the user_password view, the DNS lookup is triggered and a notification is sent.

    Since the DNS query is built in T-SQL, we have fine-grained control of the query. It means we can embed additional information like the querying user in the notification.
    On MySQL, we make use of another simple tool called canarytokend. This simple utility tails the MySQL log file, matches preset regexes and triggers alerts through the canarytokens console.
    Canarytokend is useful since it’s highly extensible; it simply tails log files and triggers tokens (MySQL is just the example log file). You can use it to watch any kind of log, and fire emails on matches.
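    The whole pattern fits in a screenful of Python. Here’s a hedged sketch of a canarytokend-style tailer – the regex and token URL below are illustrative, not the real tool’s configuration:

    ```python
    import re
    import time
    import urllib.request

    # (pattern, token URL) pairs -- both values are examples, not real config.
    WATCH = [
        (re.compile(r"SELECT .* FROM user_password"),
         "http://canarytokens.com/hypothetical123/index.html"),
    ]

    def matching_tokens(line: str) -> list:
        """Return the token URLs whose pattern matches this log line."""
        return [url for pattern, url in WATCH if pattern.search(line)]

    def tail(path: str) -> None:
        """Follow a log file forever, firing tokens on matching lines."""
        with open(path) as f:
            f.seek(0, 2)  # jump to the end of the file, like tail -f
            while True:
                line = f.readline()
                if not line:
                    time.sleep(0.5)
                    continue
                for url in matching_tokens(line):
                    urllib.request.urlopen(url)  # the GET is the alert
    ```

    Swap the regex and you’re watching Apache logs, auth logs, or anything else that writes a line when something interesting happens.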

    Document Open

    Honeydoc files are relatively well known. Simply placing a token in the document meta-data gives us a reliable ping when the document is opened. Canarytokens generates both a Word document and a PDF document.

    One trick: the PDF document will trigger a notification when opened in Adobe Reader, regardless of whether the user allows network communications!

    JS Page copied

    The Canarytoken server can also notify you if a web page you care about is copied (and hosted on another site). This is usually step 0 in a well executed phishing campaign. To make this happen, we simply create our token from canarytokens.org, then embed a small piece of JavaScript in our page that checks which domain the page is being served from, and fires the token if it isn’t ours.
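    A sketch of that check (the token URL below is hypothetical – the real snippet is generated for you when you create the token):

    ```javascript
    // Sketch of a cloned-site check. The token URL is hypothetical -- the
    // real snippet comes from canarytokens.org when you generate the token.
    function clonedSiteBeacon(expectedDomain, currentDomain, pageUrl) {
      if (currentDomain === expectedDomain) {
        return null; // served from the legitimate site: stay silent
      }
      // Report where the clone lives via the token's query string.
      return (
        "http://canarytokens.com/hypothetical123/index.html?l=" +
        encodeURIComponent(pageUrl)
      );
    }

    // In the page itself you'd wire it up roughly like:
    //   var url = clonedSiteBeacon("example.com", document.domain, location.href);
    //   if (url) { new Image().src = url; }
    ```

    A phisher who mirrors the page copies the check along with it, and the first victim who loads the clone tells you where it lives.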

    Imgur, Bitcoin and LinkedIn

    Imgur, LinkedIn and Bitcoin give us other channels for the Canarytokens server. We can use these sites as oracles, to determine whether a piece of bait has been accessed or touched.

    This isn’t new!

    Agreed, the basic concept is old. Lance Spitzner spoke about honeytokens in 2003, and Spafford & Kim mentioned the concept back in 1994.
    In fact, map makers have been mixing fake data with real data for hundreds of years to catch map thieves.

    What Canarytokens does, however, is make this concept trivially usable by everyone, while implementing a bunch of techniques and approaches which haven’t been publicly discussed.

    What if attackers blacklist the canarytokens.org domain? Wouldn’t that work?

    This would work! That’s why we suggest that you download the Canarytokens Docker image and run your own server. (You can grab the source to build it yourself from here.)

    Future announcements

    We will announce new channels, and new developments on Canarytokens through our @thinkstcanary twitter account.
  • Why control matters

    In March we moved from Groove to Zendesk – with this migration our Knowledge Base (KB) moved also.

    The challenge we faced was name-spacing: KB articles hosted on Groove lived under http://help.canary.tools/knowledge_base/topics/, but the namespace /knowledge* is reserved on Zendesk and is not available for our use. This forced us to migrate all KB pages to new URLs and update the cross-references between articles. That addressed the user experience when one lands at our KB portal by clicking a valid URL, or when typing https://help.canary.tools in a browser.

    What isn’t resolved though, is thousands of Canaries in the field that have URLs now pointing to the old namespace. We design Canary to be dead simple, but realise that users may sometimes look for assistance. To this end, the devices will often offer simple “What is this?” links in the interface that will lead a user to a discussion on the feature.

    With the move (and with Zendesk stealing the namespace), a customer who clicked on one of those links would get an amateurish, uninformative white screen saying “Not Found”.

    This is a terrible customer experience!

    The obvious way forward is a redirect mechanism which maps https://help.canary.tools/knowledge_base/* URLs to the Zendesk name-space. By implication, the DNS entry help.canary.tools cannot point directly to Zendesk; it needs to point to a system that is less opaque to us, so we can configure it at will.

    That’s straight-forward! At first glance, we should have something up and running in minutes with AWS CloudFront, which allows us to map name-spaces from https://help.canary.tools/* to https://thinkst.zendesk.com/* with minimal effort.


    Step 1:
    URL: https://help.canary.tools/some/uri
    GET /some/uri HTTP/1.1
    Step 2:
    URL: https://thinkst.zendesk.com/hc/en-gb/some/uri
    GET /hc/en-gb/some/uri HTTP/1.1

    The next step is to intercept requests to the /knowledge_base name-space and return an HTTP 301 redirect to the correct URL in Zendesk. We make use of the Lambda@Edge functionality to implement a request handler.
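    The handler really is only a few lines of Python. A sketch of the idea (the mapping entry and the fallback path are illustrative, not our actual table):

    ```python
    # Sketch of a Lambda@Edge viewer-request handler. The mapping below holds
    # a single illustrative entry; the real table maps every migrated article.
    REDIRECTS = {
        "/knowledge_base/topics/url1": "/hc/en-gb/articles/360002426777",
    }

    def handler(event, context):
        request = event["Records"][0]["cf"]["request"]
        uri = request["uri"]
        if uri.startswith("/knowledge_base/"):
            target = REDIRECTS.get(uri, "/hc/en-gb")  # fall back to the KB home
            return {
                "status": "301",
                "statusDescription": "Moved Permanently",
                "headers": {
                    "location": [
                        {"key": "Location",
                         "value": "https://help.canary.tools" + target}
                    ],
                },
            }
        return request  # everything else passes through to the origin
    ```

    Returning a response object from the viewer-request handler short-circuits the request; returning the request object forwards it to the origin untouched.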


    30 minutes and a few lines of Python later, it seemed like we had it all figured out, but for one not-so-minor detail: images in KB articles weren’t loading. What was going on?

    The DNS record help.canary.tools was pointing to CloudFront, while the origin for the content was configured as thinkst.zendesk.com. So when CloudFront requested an image, it got an HTTP redirect back to itself, causing an infinite redirect loop.

    Surely this is fixable by adding the correct Host: help.canary.tools header to the request? Nope! Instead of a redirect, we were now getting a 403 from CloudFlare (N.B. NOT CloudFront) – Zendesk uses CloudFlare for its own content delivery. WTF?!?

    After a few iterations the “magic” (read: plain obvious) incantation was discovered: talk directly to the IP address behind thinkst.zendesk.com, but present it with the help.canary.tools hostname.

    This is somewhat expected, since Zendesk is configured to think it’s serving requests for help.canary.tools.

    Without the Host mapping, Zendesk rewrites all relative URIs in the KB to yet another name-space, https://thinkst.zendesk.com/*, which brought its own set of challenges, complexity and non-deterministic behavior.

    To avoid confusion and further issues down the line we imposed a design constraint on ourselves – a simplifying assumption: the browser’s address bar should only ever display help.canary.tools – the  thinkst.zendesk.com name-space should never leak to customers.

    Committed to this approach, the next hurdle we faced was Server Name Indication (SNI).

    Server Name Indication (SNI) is an extension to the Transport Layer Security (TLS) computer networking protocol by which a client indicates which hostname it is attempting to connect to at the start of the handshaking process. This allows a server to present multiple certificates on the same IP address and TCP port number and hence allows multiple secure (HTTPS) websites (or any other service over TLS) to be served by the same IP address without requiring all those sites to use the same certificate.

    CloudFront was doing exactly what it was configured to do. It connected to (and negotiated SNI for) thinkst.zendesk.com which resulted in a 403 error because Zendesk is configured for SNI help.canary.tools.

    For any of this to work, what CloudFront needed to do was connect to thinkst.zendesk.com, but negotiate SNI for help.canary.tools. By any other name, we needed “SNI spoofing” (not really a thing – I just coined the phrase).

    Can CloudFront do that? No, it can’t :-( And just like that we had to rethink our approach: CloudFront was not the solution.

    Another failed approach was setting the Host mapping field in Zendesk to kb.canary.tools. It may have worked, but for a bug in Zendesk which fails to re-generate SSL certificates when the Host mapping field is updated in their admin console, so browsing to https://kb.canary.tools was met with certificate validation errors. How long does it take for Zendesk to rotate certificates? We don’t know (but it’s more than 30 minutes).
    There were just too many moving parts in too many systems to allow us to sanely and consistently reason about customer experience.

    Retrospectively, the root-cause of all our problems was still related to name-spacing: both CloudFront and Zendesk (rightfully) believed they were authoritative for the hostname help.canary.tools.

    • From the perspective of the entire Internet, help.canary.tools points to CloudFront.
    • From the perspective of CloudFront, help.canary.tools points to Zendesk.
    So if both systems share the same name in public, how do the two systems address each other in private?
    The answer was some form of split-horizon DNS. The least sophisticated version would’ve been to simply hack the /etc/hosts file on the host serving requests for help.canary.tools, but this functionality exists natively in Nginx’s upstream{} block. Of course, those IP addresses could change, but this is a manageable risk that can be remediated in minutes. In contrast, round-trip times on tickets to Zendesk are measured in days.
    The proxy_ssl_server_name option enables SNI, and the kb_uri variable uses the ngx_http_map_module to perform lookups/rewrites on URLs in the https://help.canary.tools/knowledge_base/* name-space.
    In the end, the Nginx configuration necessary to address our needs was as simple as this:
    map $request_uri $kb_uri {
        default "";
        /knowledge_base/topics/url1 /hc/en-gb/articles/360002426777;
    }

    upstream help.canary.tools {
        # dig thinkst.zendesk.com (placeholder address below)
        server 192.0.2.10:443;
    }

    server {
        listen 443 ssl;
        server_name help.canary.tools;

        location / {
            proxy_pass https://help.canary.tools/;
            proxy_ssl_server_name on;
        }

        location /knowledge_base/ {
            if ($kb_uri != "") {
                return 301 https://help.canary.tools$kb_uri;
            }
            return 302 https://help.canary.tools;
        }
    }

    Where are we now?

    https://help.canary.tools is now Nginx running on EC2. It’s all Dockerized and Terraformed, so the configuration and deployment are reproducible in minutes.

    Nginx SSL certificate renewals and refreshes are automated using Certbot (thanks to this guide). Down the line we can add mod_security, giving us visibility into potential attacks – a level of visibility that would have been unfathomable even if CloudFront had been a viable solution.

    Using Docker’s native support for AWS CloudWatch, all Nginx access logs land in CloudWatch, which gives us dashboarding, metrics and alarming for free.

    We now get alerted every time a customer attempts to access a missing URL in the https://help.canary.tools/knowledge_base/* name-space. Meanwhile, the customer doesn’t get an ugly “Not found” error message – they are redirected to our Knowledge Base home page, where they can simply use the search function. This has already paid dividends in helping us rectify missing mappings.

    From CloudWatch we can directly drill down into Nginx access logs to examine any anomalous behavior.

    This is in stark contrast to the world where the application layer was opaque to us – bad user experiences and broken links would have gone completely unnoticed.

    Control matters. This is why.
