The second episode of Xconomy’s new podcast, Xconomy Voices, features Recorded Future co-founder and CEO Christopher Ahlberg. His Somerville, MA-based cybersecurity company monitors both the public, visible Web and the Internet’s darker corners for “threat intelligence” that can help its clients prepare for, and fend off, cyber attacks.
Ahlberg’s background in data analytics and his perspective on cyber threats make him a compelling figure in the national cybersecurity discussion. Here’s the full transcript of our interview, which took place a few months ago. (This Q&A includes a lot of material not found in the podcast.)
Xconomy: Can I get you to start by telling us a little bit about where you come from and what you’re up to now—your background and how you came to Recorded Future. And then we can talk about the company itself.
Christopher Ahlberg: I’m the co-founder and CEO of Recorded Future. I have a background in analytics and data visualization. Originally I grew up in Sweden. I came to the U.S. 20 years ago. I started a company called Spotfire here in the Boston area that we built and sold eventually to a company out on the West Coast, and then started Recorded Future together with two guys. I’m a co-founder. We started that back in early 2010 and have been sort of off and running since then.
Xconomy: What is the central idea at Recorded Future? If you had to say what sets you apart from other companies in the cybersecurity or threat detection sphere, what’s your key idea?
CA: Cybersecurity has become an enormous marketplace in itself. It’s probably a $100 billion market, or soon to be, if you add it all up. But you’re talking about a lot of stuff that adds up to that $100 billion. We play in the market that people would call threat intelligence, probably a billion and a half of the market.
We are basically centered on the idea of being able to pick up the threats before they hit your doorstep. What happens outside your company. We try to detect bad actors before they come to you, before they attack you. We try to understand their intentions, their capabilities, what they’re up to, what they’re doing tomorrow, what they’re doing next week. But we also try to find out who is being targeted.
I would say that the best indication of who’s going to begin breaking into your house is that your neighbor has had a break-in and you have the same lock, or the guy with the same flower delivery service or the same cleaning service, and so on and so forth. So we help people understand what happens outside their firewall, basically, and do that in a way that is very actionable and very impactful to their company.
Xconomy: And what’s the basic underlying technology? What would you say are the tools you’re applying to that problem?
CA: We start off using the Web as one large sensor. Historically, when you had intelligence agents running around on the ground, you might listen into phone calls or have satellites in the sky. You actually might read newspapers, open source intelligence. There are lots of ways to do intel. We started off with this idea that the world’s information was quickly flowing to the Internet and the Web. And so we started off harvesting the Web at a large scale and doing that not just in English but in basically all the languages that bad news happens in: Chinese, Russian, Farsi, Arabic, and Spanish, French.
So it’s sort of now built up to basically covering some 30, 40 different languages in total. But then also getting into the underground of the Web—people like to call it the Dark Web—as well as into the technical areas of the Web and really being able to pull together from open sources, Dark Web sources, as well as technical sources all into one place where we connect the dots to allow us to find the most imminent threats and help them be very actionable for a company or for an organization.
Xconomy: What is your equivalent of the intelligence agents running around? I’m assuming you’re using software, right, so we’re talking about machine learning?
CA: Machine learning plays a role in all of this, absolutely. But our equivalent of the people running around…that’s a good question, actually, because it turns out that our competitors in this space, they basically all are dependent on humans collecting information, which can be all nice and fine. But the problem is it’s inherently very unscalable. We run hundreds and up towards thousands of servers in the cloud that collect information from all those types of sources I talked about before. Open sources, Dark Web sources, as well as technical sources. And then pre-aggregate and aggregate that information into consumable information or artifacts, whether it’s data visualizations or alerts or reports or API feeds for computers.
Xconomy: And then what are you doing on those sources? Can you say a little bit about the techniques you’re applying, the computer science approaches to analyzing that information, turning it into something actionable?
CA: So you can imagine that we pick up, in a Chinese newspaper, where China states their capabilities for their new offensive cyber capabilities that they’re building. Or we’re in a Russian forum and a series of Dark Web actors discuss how to commit fraud to retail business in the U.S. Or an Iranian forum where a bunch of hackers talk about how to [gain] entrance into a facility of some sort.
From there, what we do is natural language processing. So, being able to look at language where it might say, Actor A says to Actor B, “I’m developing this piece of malware that has this capability,” and so on and so forth. We’re able to take that natural language and decompose that into data points, which can range everything from threat actors to their capabilities, their intents, as well as the technical data that goes with it. In other cases we pick up more technical information and organize that. Part of the trick here is being able to marry the data coming from narrative text with the data that comes from more technical sources, and put that together in a place where it can be consumed by either human or machine.
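As a rough illustration of what decomposing narrative text into data points can look like, the Python sketch below pulls two kinds of technical indicators out of a forum-style message. It is not Recorded Future's pipeline, which the interview does not describe in detail. The function name `extract_data_points`, the sample message, the placeholder identifier, and the regular expressions are all hypothetical, and real natural language processing goes well beyond pattern matching.

```python
import re

# Hypothetical patterns for two kinds of technical data points.
# The IPv4 pattern is deliberately loose and does not validate octet ranges.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_data_points(author: str, text: str) -> dict:
    """Turn one message into a structured record (illustrative only)."""
    # Normalize identifiers to upper case and remove duplicates.
    cves = sorted({match.upper() for match in CVE_PATTERN.findall(text)})
    ips = sorted(set(IPV4_PATTERN.findall(text)))
    return {"actor": author, "vulnerabilities": cves, "ip_addresses": ips}


# Made-up message; the identifier and address are placeholders.
message = "Building malware that exploits cve-2099-0001, test box at 10.0.0.5"
record = extract_data_points("actor_a", message)
print(record)
```

A record like this is the kind of structured output that could then be joined with data from more technical sources.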
Xconomy: And then I guess you supply this intelligence to your clients in a form that lets them do something with it. So can you give an example of how this information might be made useful if you become aware of a threat that may or may not materialize. What do you do next?
CA: Assume you’re a large financial institution somewhere here on the East Coast. Lots of different things can happen, and you might find out that your neighbor bank was hacked. You want to know about that. You want to know about it immediately. Because many times the bad guys…will come after you. Not necessarily because they wake up in the morning and say, “I want to go hack Bank X,” but because they found a way in. And once they’ve found a way in, they’re going to go from Target 1 to Target 2 to Target 3 or 4.
So you’re a target of convenience and you just want to make sure that when you see Target 1 being hacked, and if you’re Target 3, you want to patch your systems based on what you learned from that hack on Target 1. So keeping a high degree of visibility into what’s going on in the threat landscape is highly important. That’s number one.
Number two, you might find out, whether it’s in a forum or similar, that the bad guys talk about what software vulnerabilities they’re going to attack. So you’re going to see somebody saying, “Look, I’m building a piece of malware that takes an exploit that takes advantage of this vulnerability,” and they might actually use the same sort of descriptor for the vulnerability that the U.S. government publishes. And then they’ll talk about that and they might actually publish an exploit kit on that and sell that to a fellow bad guy. So again, if you can gain that sort of intelligence, that will tell you what systems you need to patch on your site to get out ahead of any threat like that.
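The patching scenario can be sketched in a few lines of Python, assuming the shared descriptor is a CVE-style identifier and that the defender keeps an inventory of unpatched vulnerabilities per system. The function `systems_to_patch`, the system names, and the identifiers below are hypothetical placeholders, not anything described in the interview.

```python
def systems_to_patch(mentioned_cves, inventory):
    """Return systems whose unpatched vulnerabilities appear in threat chatter."""
    mentioned = {cve.upper() for cve in mentioned_cves}
    hits = {}
    for system, open_cves in inventory.items():
        # Keep only the identifiers that attackers are actually discussing.
        overlap = sorted(mentioned & {cve.upper() for cve in open_cves})
        if overlap:
            hits[system] = overlap
    return hits


# Placeholder inventory: system name -> identifiers not yet patched.
inventory = {
    "mail-gateway": ["CVE-2099-0001", "CVE-2099-0002"],
    "web-frontend": ["CVE-2099-0003"],
}
chatter = ["cve-2099-0001"]  # identifiers seen in forum posts (made up)
to_patch = systems_to_patch(chatter, inventory)
print(to_patch)
```

Matching on a shared identifier is what makes the intelligence actionable here: the defender does not need to understand the malware itself to know which systems to patch first.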
It could be that you find leaked credentials—usernames and passwords that have been stolen from a company. And on and on. There are many of these sorts of scenarios. In fact we probably count some 60 or 70 different core scenarios of this kind that we configure into our product, to help customers be prepared for threats and be able to take action on the scene as soon as they see them.
Xconomy: Would you say that the kind of threat intelligence you guys provide is a necessary complement to a more defensive internal posture? Lots of companies have their own Internet and Intranet security software and you’re not saying that they can do without that. You’re just saying that unless they understand the threat environment, they’re only halfway prepared.
CA: Yeah, exactly. Historically people have invested primarily in defenses of their firewall, and that’s where that other $80 billion has been going, to try to firewall their perimeters, and they’ve been building what people have thought about as higher and higher walls… thicker and thicker walls. The problem is that if you think about this in reality, people haven’t been building thicker or higher walls, they’ve really just been building mazes.
And those mazes are getting more and more complicated. And the interesting dilemma that we end up here with is that we have a high degree of personnel turnover in security departments. It’s one of the highest-turnover sort of places. We need to know all the possible entry points and be able to try to do something about those. The bad guy, he just needs to know one password. So what we have really [is] just sort of a superficial wall, and you need to supplement that with other approaches, threat intelligence being one of the few ways that you actually can supplement that.
The other analogy is, think war. If you were defending a castle or a moat or a bridge or whatever, you’re going to defend it. Whether you were in Roman times or these times, you’re not going to just sit there and wait it out. You’re going to send out people to look, before the bad guy shows up. And there are many ways to send that guy out. And we’d like to think we’re one of those ways. To fight this war without intelligence is really just stupid.
Xconomy: I’d like to hear a bit about the history of Recorded Future. My understanding is that when you guys started out in 2009 you were using these sources on the open Web to try and make predictions about the near future, based on chatter or intel from the open Web from maybe hotspots around the world. What was the original idea, and what does it take, or what did it take then to be able to make a reliable prediction about a world event?
CA: The company got started in more general intelligence. It’s always had a very strong focus on intelligence, and we still have a strong foundation in that. And even in our business there’s sort of a strong foundation in intelligence. And then over time what we’ve done is just sort of build more and more of the business to focus on cybersecurity, because that’s become such an enormous opportunity. And we see the next decade here of security intel being a fantastic place to be, in terms of being able to use this data in a predictive manner.
It all depends on what you’re trying to predict, and it turns out that the general prediction business is not a very good business, and it’s not even very interesting from an intellectual point of view. Maybe if you’re highly academic, it’s interesting. But we realized that to build a real interesting business you have to attack a business case, and we focused on cybersecurity. It’s more interesting to think about how to answer meaningful questions for customers, and customers come to us and say, “Look, I’m interested in what do I need to worry about next week.” So, predictions are sort of all about what problem you’re trying to solve here, and we try to stay away from the general predictions.
Xconomy: What is the Dark Web, exactly?
CA: There are three levels to the Web, and three levels to where we collect intelligence. One is the surface Web. That’s what we all use, whether it’s a U.S. newspaper or a European blog or social media, whatever. Some of it is easy to get to, some of it is harder. The harder parts might be in different languages. But all of it is fairly easy because I can get to it on my browser.
Then there are places called the Dark Web, in the Tor domain, or the Onion domain as people might call it, where the content is encrypted and it’s set up in a way that makes it very hard even to read the data.
And then finally the stuff that it’s really hard to get to might require registration down there. As you dig in you’re going to find areas of the Web where you not only have to register but you may need to have another guy who will vouch for you to get in there. It can get pretty hairy.
Then there is the technical Web, where you think about IP addresses and domain names and file caches and malware.
And the most interesting point about that is that people tend to sort of read some Wired article from 10 years ago saying that the Dark Web is enormous and it is much bigger than the open Web, and that’s just not true at all. It’s actually pretty damn small. But there are some portions of it that are very interesting for the area of cybersecurity, if nothing else.
Xconomy: As a startup, I imagine that you see yourselves as a disruptive organization, helping people reinvent or rethink certain kinds of security. But I wonder who you would say you’re disrupting? Whose business are you taking away? What dollars are flowing to you guys that might be flowing to someone else otherwise? Or is it a green field?
CA: People have always done intelligence just sort of naturally, when you open up the newspaper in the morning. These days you might open up a Web browser. So we’re always trying to figure out, how can we make that work more effective for people. So you could say we are disrupting the element of the security market that has historically been served through intelligence providers who are doing things manually and might have been writing and sending customers written reports of various sorts. And so we’re clearly disrupting that. But most of this is a green field. Threat intelligence is a new area, and we can sort of come in and help customers get into this. The good news is that the acceptance for this is happening very, very fast, and people are defining the market. There are directors of threat intelligence, and they have budget to buy this. And it’s a lot of fun.
Xconomy: You guys have a blog where you sometimes share some of the things you’re discovering about what’s going on on the Dark Web, and in recent months you’ve published a couple of stories about this Russian hacker whom you’ve given the name Rasputin. And I wondered if you could tell that story, because I think it might be a good illustration of the kind of things that you guys do. And also could you explain why it’s important to you to share that kind of story on your public blog?
CA: Number one, we’ve always been pretty passionate about trying to share analytics stories in our blog, because it’s sort of important to put life into this sort of stuff. In the early days of the company there were a lot of conspiracy theories about us and what we did and what we didn’t do, and by making that concrete we hope that we helped that and sort of showed good examples of how the product was used. We continue to do that to this day.
And then number two, we want to share and give back to the security community. And I think that’s sort of the cool thing. The security community—in many ways it is a community, so everybody tries to share and publish stories. Partially because they obviously want to be thought leaders, so they’re self-serving. But also it truly is a community. We want to give back.
Specifically on this Rasputin story, we found somebody after the election, interestingly enough, who was selling access to a particular government agency called the Election Assistance Commission, whose job it is to help organize, help build, sort of put in place systems for running elections and helping states with this. It’s this fairly small agency, but they have a very particular mission there. And it could make it an interesting target for some. It’s not a good target if you wanted to come in and fiddle with voting systems or any actual vote tallying. But it can be a very interesting target if you want to affect and influence systems. It would be something an anonymous hacker could not pull off, because it’s been too hard. But a government actor could certainly do something very interesting with this.
So in mid-November we detected this guy that we put the name Rasputin on. He was selling access to this. And we detected that. We took it off the street. We bought the exploit, which was somewhat edgy in the way that we did it. We shared with the relevant government authorities in a nice orderly fashion, and then worked for them, primarily. They worked on it and we supported them as much as we could over a series of weeks, and then eventually we published this, because we thought it was an important story that needed to be told about this actor and it drove a fair amount of attention.
Xconomy: You said you took this exploit off the street. Are you actually taking it off the street when you buy it? How do you know that it’s not still out there waiting to be sold to the next customer who comes along?
CA: The simple answer is, you don’t. But these places are marketplaces. A couple of things matter. Money certainly is one. The other one is reputation. Most of these actors will build their credibility over time. And they might start off by selling something small, and they’ll build up to selling something bigger and with more money later on. And if they try to mess around with somebody, and they sell something to somebody and then two days later they come back and sell it again, that’s not going to help their reputation. And the reputation is one of the few things they have to sort of trade with or offer up down there. So it’s a very weird place. That’s why you’ve got to know what you’re doing.
Xconomy: Just for the sake of clarity, the attack or the infiltration you guys detected had nothing to do with the earlier stories last year about the DNC servers being hacked by Russian actors. This is a separate story from all that, right?
CA: Absolutely, completely separate. One of the interesting things about what we have come to call the DNC hack is that actually, that actor, who presumably was what we call APT28 and APT29, did not attack any U.S. government systems. And I believe that that was quite deliberate, to make sure that they got an effect on the U.S. election but without touching any U.S. government systems. Because now there is no line in the sand here, but there certainly has been talk about a line in the sand saying that if you affect critical infrastructure, that would be the equivalent of going to war. And these guys were clever. They accomplished their goal without touching any U.S. infrastructure or any U.S. government infrastructure.
Xconomy: With the DNC hacks and Rasputin and other stories surfacing, it seems like there is an increasing velocity of cyber-espionage, cyber warfare, and other attacks going on. And I’m wondering whether, objectively, that’s true, or whether there’s simply more coverage and more awareness than there used to be. Is there any way to gauge that, from your perspective? Is there more hacking going on, or is it just that the public is learning about those stories more often?
CA: There’s a little bit of a perfect storm, absolutely. There is more media coverage. But yes, there is more hacking, absolutely. But maybe more interestingly, we’re seeing a different sort of hacking than what we’ve seen before. I’m using hacking in a very liberal sort of way. I think it’s two things here. We used to see a lot of people stealing credit card information or stealing credentials and that’s sort of come to the point where when we see now yet another Yahoo hack, somebody steals 500 million credentials, people go like, “Oh, OK, so what.”
Now, what we saw last year in 2016, and presumably we’re going to see more of, is three things. One is the attack on elections and the political infrastructure. And I use the words political infrastructure maybe more than just the election infrastructure, because it’s unnecessary to attack election infrastructure. It’s better to [attack] the political infrastructure. We’ve seen that happening in France now. It happened in the U.S. and we’re going to see it elsewhere. So that’s one.
Number two, the idea of attacking the Internet in itself. The Mirai botnet in the fall where somebody attacked the Dyn servers up in New Hampshire, and which obviously had enormous impact on the Internet for a couple of days. That’s scary in terms of securing the Internet in itself.
And then, three, systems that we never thought were hackable at all being attacked. The guys who got away with $89 million from Swift was a huge deal in my mind, in the sense that they attacked something that we thought was air-gapped: the Swift network, the money transfer network. These guys had the intention of stealing a billion dollars, and that’s a very different proposition than stealing tens of thousands or hundreds of thousands of dollars’ worth of credit card information off credit cards. But when you start stealing a billion dollars, now we’re talking a different league here. And so I think with these three examples, it’s less about the volume. Those three examples took it to a different level here.
Xconomy: So I imagine there is increasing appetite or awareness or demand for services like yours. Can you give me some vital statistics about the company right now? How big have you gotten, how much money have you raised, how many offices do you have? Help people gauge the size of Recorded Future.
CA: As with any other private company, we’re trying to be coy about that stuff. But the stuff that you can figure out without too much Googling around is that we’ve raised about $30 million in total. Actually it’s been a good while since we raised money. Hopefully that’s a good thing. You could probably dig around on LinkedIn and figure out that we’re some 120 people or so in total. And so that’s probably not the best kept secret. We’re in four or five different offices. You’ll find us in Boston and Washington, DC, in Gothenburg in Sweden, as well as in London, U.K. We’ve got a little bit less than 25,000 people who receive our Cyber Daily every day. We think that’s very cool. That’s probably the largest intel brief that goes out in the world. That’s larger than some of the big military briefs. At least, it’s the biggest civilian intel brief that goes out there. And then [we have] hundreds of customers, and that’s probably the only thing we’ll say about that.
Xconomy: What do you guys do to keep getting better at what you do? Do you have an R&D division, or are you constantly reinventing yourselves, your software, and your infrastructure?
CA: In intel, in all software, you have to keep improving. Cyber is sort of unique in the sense that you have a sentient opponent who is literally trying to fiddle with the data that you’re trying to measure. And that happens in a few other sort of places, maybe in trading bots and Wall Street, there are a few places this happens. So that means—and this is true in intel in general—as you collect intel data, your adversary may literally try to change what you’re measuring. He might change his targeting. He might change the methods that he uses. He might change lots of different things. If you sit still, not much interesting is going to happen. And if nothing else your competitors are going to come up and catch up. So you have about nine different reasons to keep inventing very fast here.
And so yes, we try to do a lot of different things where we run a core collection apparatus here just like any intel agency would. And we keep investing and improving that. That’s sort of the number one.
Number two then, collecting new types of data. We do that all the time. If you think about it in military terms, people think about all source analysis, trying to get lots of different angles on the same problem set. So adding more types of data to what we do, trying to figure out how we can do what I call pre-connecting the dots for our customers, because that’s the hard part. So we probably have some 30 billion dots, 30 billion rows in our database. And that’s a lot, when you’re going to sit down and say who actually did Attack A or who was targeted in Attack B. When you start with 30 billion of anything, it’s really hard. So we try to connect those dots.
And now we come to what you talked about before, machine learning and other techniques to try to actually help people make sense of this data. But we’re also firm believers in enabling the human in all of this. There is no easy button here where you just push a button and out comes an answer. But we like to think about enabling the humans, the analysts. And we talked a lot about creating cyber threat intel “centaurs” who are enabled through our machines, our systems, to be smarter analysts.
Xconomy: Are there any misconceptions about cybersecurity that particularly annoy you?
CA: So, two. One on the target side and one on the attacker side.
On the target side, I think everybody would love to think that they’re a target of choice, that they’re very special, and that somewhere there is somebody sitting in a dark cave and trying to figure out how to attack you. In reality it’s bots. These scanner systems are going up and down, up and down the Internet and looking for something that’s vulnerable. And as soon as they find something, they go to work. People like to think that they’re a target of choice. In reality they’re probably a target of convenience.
On the other hand, in terms of the attackers, people like to think that attackers are super hyper advanced. And yes there are a few places that are super hyper advanced. But even the super advanced sort of people or organizations are unlikely to apply the most advanced methods in there. The most advanced methods, you’re unlikely to want to burn. You will actually keep them for the time when you really, really need them. Regardless of who did Stuxnet (we could argue about that all day), in that attack some pretty juicy stuff was used. But they’re burdened with it because the code is out there then; it was there for everybody. Not only was [it] sort of showing the hand of what’s possible; somebody showed in a very clever way that you could jump into an air-gapped facility. That had not been done before.
But an attacker is more likely to find some more basic methods and just try to reuse them and reuse them and reuse them, in a more sort of mechanical fashion, and to attack those targets of convenience.
Xconomy: To wrap up, what’s the most fun thing about your job, and what’s the most challenging thing about your job?
CA: If you think about it, there are very few places where, in a commercial environment, you get to walk in the door in the morning and come in and chase bad guys, criminals, nation states, what have you. And try to outsmart them and try to figure out how we can make the Internet a safer place. In terms of what we do for threat intel, I think we’re pretty damned unique, and there are very few places where you get to walk through the door and have a database of 30 billion records and try to figure out how to grow that in clever ways and how to use that data to figure out the intents and capabilities of the bad guys. It’s pretty unique and pretty fun.
Xconomy: But what gives you a headache by the end of the day? By the time you’re walking out that door you probably are feeling differently, right?
CA: It’s probably exactly what I just said—the same things. It’s very, very, very hard because the bad guy is sitting on the other end, and is not sitting still. But I think if you’re going to be an entrepreneur, you can’t worry about headaches. There are going to be headaches every day all the time. And headaches are to be got rid of.