Monday, July 25, 2016

This Old Vulnerability #2: NetBSD and OpenBSD kernfs Kernel Memory Disclosure of 2005


Time is an Illusion

[Editor's Note: This is part one of a two part post, the second of which is Vineetha Paruchuri's guest co-post, which can be found: here]

It makes sense to me that physicists have been arguing against time as a physical construct for years now, because as humans we have a clear penchant for ignoring time altogether. More precisely, we seem to ignore history as if it never happened. And, when we do recall historical events, we somehow do so erroneously. This isn't just true in the world of politics or law, it's true in every facet of society. Tech, and sometimes especially tech, is no outlier. 

In 2005, I was bored, making silly bets with friends on IRC about how fast we could find exploitable bugs in "secure" operating systems. This was pretty common for us, as young hackers spend the majority of their time reading source code. A good friend pointed out that the increased scrutiny on the BSD variants was decreasing the number of exploitable integer overflow attacks on kernels. I argued that this was probably false, and that there were lots of bugs yet to be found. 

What's interesting is that this bug class is still prevalent today. In fact, it may be the most underreported bug class in the history of computing. The LZO and LZ4 memory corruption bugs I released in 2014 are of the exact same class of exploitable integer issues. Because of pointer arithmetic, and how CPUs manage the indexing of memory, these bugs are extremely difficult to find and remediate. The difficulty of this bug class caused the LZO vulnerability to persist in the wild for over 20 years, and allowed variants of LZO, such as LZ4, to be created with the exact same vulnerability.

Finding the Bug

Back to my friends and me on IRC: we made a bet. Find an exploitable kernel vulnerability affecting any BSD variant within an hour. The winner gets bragging rights. I almost lost, having found the bug in literally 57 minutes and some seconds.

The bug? An integer truncation flaw in the NetBSD and OpenBSD kernfs pseudo-filesystem. This file system provides access to kernel abstractions that the user can read to identify the state of the running kernel. In Linux terms, these abstractions would all be handled by procfs. On BSD, procfs was (is?) a pseudo-filesystem providing insight into only active processes, themselves. On Linux, procfs provides access to kernel objects ranging from the CPU, to VMM, processes, and even network abstractions. 

The flaw was discovered by trolling through NetBSD patches. In fact, I discovered the bug by identifying a patch for a similar integer problem committed days earlier, simply by chance. Because I constantly monitored the patches for all BSDs, it was easy to troll through the patches identifying ones that may be valuable. An interesting commit tag caught my eye:

Revision 1.112, Thu Sep 1 06:25:26 2005 UTC, by christos
Branch: MAIN
CVS Tags: yamt-vop-base3, yamt-vop-base2, yamt-vop-base, thorpej-vnode-attr-base, thorpej-vnode-attr
Branch point for: yamt-vop
Changes since 1.111: +6 -6 lines

Also protect the ipsec ioctls from negative offsets to prevent panics
in m_copydata(). Pointed out by Karl Janmar. Move the negative offset
check from kernfs_xread() to kernfs_read().

As depicted above, the patch applied at revision 1.112 purports to prevent multiple integer-related bugs from being triggered in the kernfs_xread function. It does so by moving the check for valid read offsets to kernfs_read. One might think, at this point, that this is a solved problem. Presumably all bugs in the former function can be resolved by placing the check in the latter, parent function.

However, there is an easy to spot problem in the patch. Consider the following code:

int
kernfs_read(v)
 void *v;
{
 struct vop_read_args /* {
  struct vnode *a_vp;
  struct uio *a_uio;
  int  a_ioflag;
  struct ucred *a_cred;
 } */ *ap = v;
 struct uio *uio = ap->a_uio;
 struct kernfs_node *kfs = VTOKERN(ap->a_vp);
 char strbuf[KSTRING], *bf;
 off_t off;
 size_t len;
 int error;

 if (ap->a_vp->v_type == VDIR)
  return (EOPNOTSUPP);

 /* Don't allow negative offsets */
 if (uio->uio_offset < 0)
  return EINVAL;

 off = uio->uio_offset;
 bf = strbuf;
 if ((error = kernfs_xread(kfs, off, &bf, sizeof(strbuf), &len)) == 0)
  error = uiomove(bf, len, uio);
 return (error);
}

Initially, this looks appropriate. The function now checks to see if the file descriptor associated with a kernfs file has a negative read offset. If a negative offset is identified, the function returns with an error. Otherwise, the offset is passed to kernfs_xread and presumed safe for all operations within that function. 

This should be fine, except for the function kernfs_xread, itself. Here is the definition of the function:

static int
kernfs_xread(kfs, off, bufp, len, wrlen)
 struct kernfs_node *kfs;
 int off;
 char **bufp;
 size_t len;
 size_t *wrlen;
{

In BSD variants, the off_t type is always a signed 64bit integer to accommodate large files on modern file systems, regardless of whether the underlying architecture is 32bit or 64bit. The problem arises when the 64bit signed integer is checked for its sign bit, then passed to the kernfs_xread function, whose off parameter is declared as an int. Passing the off_t to the function truncates the value to a 32bit signed integer. This means that the check for a negative 64bit integer is insufficient. An adversary need only set bit 31 of the 64bit offset to ensure that the value passed to kernfs_xread is negative.

The result of this integer truncation bug can be observed at the end of kernfs_xread. At the end of this function, we have the following code, regardless of which type of kernfs pseudo-file is being read:

 len = strlen(*bufp);
 if (len <= off)
  *wrlen = 0;
 else {
  *bufp += off;
  *wrlen = len - off;
 }
 return (0);
}

This code ensures that the size of the data copied back to userland is very large, and that the pointer to the data being copied will point outside the valid memory buffer for the given file. What's really great about this bug is that both kernel stack and kernel heap can be referenced, depending on which kernfs file is being read while triggering the bug. 

This allows an attacker to page through heap memory, which may contain the contents of privileged files, binaries, or even security tokens such as SSH private keys. Paging through stack memory is less immediately valuable, but allows an attacker to disclose other tokens (such as kernel stack addresses) that may be relevant to subsequent attacks. 

Patching the Bug

Though this vulnerability affected both NetBSD and OpenBSD, OpenBSD claimed that "it isn't a vulnerability" because they previously removed the kernfs filesystem from the default OpenBSD kernel. However, it was still build-able in the OpenBSD tree at the time, meaning that it was indeed a vulnerability in their source tree. It just wasn't a vulnerability by default. This was yet another misstep in a long standing career of misdirection by the core OpenBSD team. The NetBSD team reacted quickly, as kernfs was not only still integrated into the default kernel, it was mounted by default, allowing any unprivileged user access to abuse this bug. 

I sold this vulnerability to Ejovi Nuwere's security consulting firm, which ethically acquired software flaws in order to help promote its consulting practice. Tim Newsham reviewed the flaw and agreed that it was an interesting finding. Ejovi's team managed the relationship during patching and helped develop the resolution with the NetBSD team, which was quick to patch the bug. I was impressed with Ejovi's professionalism, and also appreciated the NetBSD team's fast work, and the fact that they didn't whine about the bug in the way OpenBSD did.

The patch fixed the bug by performing the check on the truncated integer rather than the signed 64bit offset. 

@@ -922,18 +922,18 @@ kernfs_read(v)
  struct uio *uio = ap->a_uio;
  struct kernfs_node *kfs = VTOKERN(ap->a_vp);
  char strbuf[KSTRING], *bf;
- off_t off;
+ int off;
  size_t len;
  int error;
 
  if (ap->a_vp->v_type == VDIR)
   return (EOPNOTSUPP);
 
+ off = (int)uio->uio_offset;
  /* Don't allow negative offsets */
- if (uio->uio_offset < 0)
+ if (off < 0)
   return EINVAL;
 
- off = uio->uio_offset;
  bf = strbuf;
  if ((error = kernfs_xread(kfs, off, &bf, sizeof(strbuf), &len)) == 0)
   error = uiomove(bf, len, uio);


Breaking the Historical Cycle

While we considered the patch adequate at the time, we were wrong. The reason for this is based on the logic from the first This Old Vulnerability blog post: an integer doesn't need to be negative to create a negative offset or an over/underflow when applied to an arbitrary pointer in kernel memory. This is because the value of any given pointer does not start at address zero, even though systems engineers often presume that it does.

Tests often presume a base address of zero, rather than evaluating the pointer's actual address plus the offset applied to it. If a 32bit pointer holds the address 0xb0000000UL, an integer overflow will occur with an offset far smaller than would be required to set a sign bit. If this pointer and a sufficiently large offset are used in an inadequate expression, the test can pass even though the resulting address is invalid. Consider the following pseudo-example:

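uint32_t * p = (uint32_t *)0xb0000000UL;      /* base is nowhere near zero */
uint32_t off = 0x60000000UL;                  /* unsigned, sign bit clear */
uint32_t * max_p = (uint32_t *)0xb0008000UL;
/* Deliberately flawed: off < 0 is always false for an unsigned type,
 * and on a 32bit machine p + off runs past the top of the address
 * space, so the comparison against max_p cannot be trusted. */
if(off < 0 || p + off >= max_p)
        return EINVAL;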


Some compilers will actually compile out part of the above check. An unsigned value can never be less than zero, and overflowing pointer arithmetic is undefined behavior in C, so the compiler is free to assume it never happens. But, if engineers don't notice this, or if there is no warning message printed by the compiler, or if an IDE is being used that doesn't adequately highlight the warning messages, this can result in critical flaws in software.

Testing this properly requires policy that evaluates both the base of the pointer and a ceiling for the pointer given the context of its usage. If a pointer points to a structure of a particular size, any expression that results in an address must be verified to land within that structure. This can be done by performing the operation, storing the result in the appropriate type, then evaluating the address as being within the structure in memory. 

As noted in the previous blog post, this requires organizational coding standards that enforce policies on how pointer expressions are evaluated and how they are tested. It also requires an evaluation of the context of each pointer.

As always, these improvements are challenging to implement because they aren't simply a coding construct. This is an organizational problem that must be addressed at the management level along with each individual engineer's coding practices. Peer reviews must be accentuated with policies that guide auditing practices, and guarantee a higher level of success in catching and fixing these issues. For help, consider hiring Lab Mouse Security to assist with your internal code audits, and break the seemingly eternal cycle of exploitable integer vulnerabilities!


An Introduction

For those that don't know her, Vineetha Paruchuri is a brilliant up-and-coming information security researcher. She and I have been discussing the effects of security flaws that have persisted over decades, how langsec addresses some of the remediation/mitigation potential, and what gaps still remain.

This resulted in a guest post where Vineetha evaluates modern active models for the reduction of security flaws, rather than retrospective models which include code reviews, bug reports, etc. I highly suggest reading her guest blog as a co-piece to this one, and a primer for anyone interested in the modern movement to active, rather than passive, vulnerability reduction models. 

Don A. Bailey
Founder and CEO

This Old Vulnerability: Guest Post: Vineetha Paruchuri on Modeling How Vulnerability is Created, Rather than Remediated

[Editor's Note: Vineetha's guest blog is a companion piece to the Lab Mouse post found here]

It all started on Twitter when I called Bailey out on his crappy taste in music (naturally, he vehemently disagrees with the “crappy” part). [Editor’s Note: My musical tastes are sublime and don't include Evanescence…] [Author’s Retort: N-O-P-E]


We got to ranting about InfoSec things in private; I initially felt that nuances in textual conversations usually get lost in translation, and that one might often need to explain further. It quickly became evident that this was not the case in our discussions.


Of course, like your typical hyper-rational engineers, we instinctively started modeling our behavior - analyzing why we seem to process information very similarly, how people intellectually process things in general, how that affects the code they write, or the way they visualize technical problems, or the way they interpret security concepts. This line of thought extended to our discussion on vulnerabilities.


For the better part of the past year, I have passively been mulling over specific combinations/variations of arguments from a couple of papers, because I saw immense potential for these ideas in practical scenarios. Visualizing these arguments from the perspective of vulnerability identification and disclosure (residual thoughts from my discussion with Bailey) gave me the much-needed context that tied some things together.


In most cases, at the core, all vulnerabilities boil down to something that the developer/architect/whoever overlooked, that someone else noticed. To simplify terminology, let’s call this “someone else” an attacker, and the “developer/architect/whoever” a systems designer. The system is ultimately designed for the end-user.


The attacker might see things that the systems designer missed, because attackers visualize the system quite differently. Further, the end-user might (un)intentionally perform some action(s) that might send the system into a state not initially modeled by the systems designer. In such cases when the system does not behave as expected (and also in other cases e.g. when the end-user doesn’t get the desired functionality), the end-user often figures out workarounds to get the job done. Such workarounds routinely circumvent established security mechanisms in place too; once the system is not in a documented state, there is no saying what security measures were bypassed because of the workaround.


In essence, when analyzing from the context of actor-behavior, vulnerabilities can be the result of any (or all) of the above factors, or some combination thereof. At a glance, it looks like delineating and formalizing these factors would have some value from the perspective of vulnerability analysis.


Based on the above reasoning, we can delineate the major factors contributing to software/system vulnerabilities from the actor-behavior standpoint as follows:


First, the issue of what the systems designer doesn’t see that others might see: the blind spots. In “It’s the Psychology Stupid: How Heuristics Explain Software Vulnerabilities and How Priming Can Illuminate Developer’s Blind Spots”, Oliveira et al. discuss the idea that “software vulnerabilities are blind spots in the developers’ heuristic decision-making process”.


Second, the issue of how the attacker-mindset differs from other actors’ in the system, and what that means. Quite a lot has been written on this topic (hacker behavior/motivations) from the perspective of sociology/psychology, law/policy, technology etc., but some interesting thoughts on how to cultivate an attacker-mindset, and what the “hacker methodology” is, are given in “What Hackers Learn That The Rest Of Us Don’t” by Sergey Bratus.


Third, the obvious existence of differential perceptions amongst various actors in the system, the resultant security circumvention and suboptimally-defended systems exposed to vulnerabilities. In “Mismorphism: A Semiotic Model Of Computer Security Circumvention (Extended Version)”, Smith et al. examine security circumvention using a model based on semiotic triads. How differential perceptions affect systems has been explored from the perspective of security circumvention in the paper, but it got me thinking about how the same idea can also be explored in settings not necessarily involving security circumvention.


Although not all of these arguments apply directly (they all certainly apply in other ways, more on that in another post, another time perhaps) to the vulnerability we are currently discussing, I briefly touched upon them because all these issues are interrelated, and the larger issue of vulnerability identification/mitigation is better served when such component-issues are discussed together. In essence, understanding the core logic behind each of these arguments and tailoring it to apply to specific contexts might help in better vulnerability detection and mitigation. Plus, anyone looking at the same issues now has a decent starting point on where to find relevant information in case they want to explore these issues further.


That said, in the context of the vulnerability that’s currently being discussed, apart from thinking about langsec (but of course! Again, more on that some other time), further analysis of the first issue listed above concerning developer blind spots could prove quite useful. The primary argument comes from the paper “It’s the Psychology Stupid: How Heuristics Explain Software Vulnerabilities and How Priming Can Illuminate Developer’s Blind Spots” by Oliveira et al.


The learnings from Oliveira’s paper directly play into the remedial measures Bailey touched upon in his post - enforcing organizational coding standards, evaluating the context of each pointer, and improving coding practices etc. Rather than looking at the issue retrospectively, such as in the context of code reviews, Oliveira et al.’s paper outlines how we can prime the developers to minimize such blind spots while coding (of course, code reviews can/should still be done, but increasing the quality of the code is always the primary goal).


Oliveira’s paper explores a new hypothesis that software vulnerabilities arise due to blind spots in developers’ heuristic decision-making processes. Another hypothesis (that neatly dovetails with the former) is also investigated in tandem, as to whether priming software developers on the spot (as opposed to drawing from previous security knowledge), and alerting developers to the possibility of vulnerabilities in real time would be effective in changing developer-perspective on security, eventually making security-thinking a part of developers’ repertoire of heuristics.


This paper points out, quite rightly, that “The frequent condemnation of security education and criticism on software developers, however, do not help to reason about the root causes of security vulnerabilities”.


Psychological research shows that, due to limitations in humans’ working memory capacity,  humans often engage in heuristic-based decision-making processes. Heuristics are simple computational models that help solve problems without needing to consider all the information available. Because of their relative simplicity, heuristics require less cognitive effort, and hence they are an adaptive response to humans’ short term working memory when dealing with complex problems with a large amount of information. In such situations, due to limitations in working memory capacity, humans make “simplified, suboptimal decisions regardless of the rich information available”. We need to consider such cognitive limitations if we want developers to come up with more secure code; security education and/or code reviews alone wouldn’t be effective in making code safer.


Oliveira’s paper proves this primary hypothesis, and suggests priming, as in explicitly cueing developers on-the-spot, as an effective mechanism to eventually incorporate security-thinking as a part of developers’ cognitive processing. One of the ways the paper proposes to do this is to have developer-interfaces (such as IDEs, text editors, compilers etc.) display security information pertinent to the context of the current working scenario.


Naturally, further research needs to be done regarding what specific security information is useful, and what interfaces work best, if there are other/better ways to prime developers etc., but the point here is that more security education and more code reviews alone are not the answer to preventing such vulnerabilities.


One needs to get to the root of the problem - be it addressing systemic insecurity in the coding language, mitigating developer blind spots, or bridging differentials in actor-perspectives.


So why should we care about mechanisms factoring in actor-behavior when code reviews, semantic checkers etc. work just fine?


Firstly, they clearly don’t, at least not well enough (also, maybe things working just fine doesn’t quite cut it for some folks).


Second, this is also what someone dealing with enough vulnerability identification and mitigation might instinctually reason out (but since we technologists tend to trust empirical evidence better, the papers I cited should do the job?). For example, in the context of the current vulnerability, Bailey says the following:


“But, if engineers don't notice this, or if there is no warning message printed by the compiler, or if an IDE is being used that doesn't adequately highlight the warning messages, this can result in critical flaws in software.”


I know for a fact that he hadn’t read Oliveira’s paper before he wrote that (not even sure he read it beyond the abstract even now). In fact, looking at what happened in the code and how the whole thing played out prompted me to think about how priming could apply here, and then I saw that Bailey reasoned it out the same way too!


So yes, even in the worst case, considering that such mechanisms factoring in actor-behavior would not be useful in any other context (while *I* think that they most certainly would be) - at least a few such subclasses of fairly intractable bugs (like the current one) can be caught/mitigated more effectively.


Third, solving for mitigating a vulnerability at the source would in turn facilitate more effective mechanisms for identifying vulnerabilities. For example, if we identify the primary factors causing such vulnerabilities, we could potentially leverage that knowledge toward building more effective systematic/automatable vulnerability identification mechanisms (yes, a few formal mechanisms currently exist, but their efficacy leaves a lot to be desired, because they’re acting more as band-aids than stemming from addressing the root cause; i.e. they’re often not solving the right problem [1]).


What I mean to say is...  


Hence why maybe... it’s about d*** time we started looking at these issues as more than just failures in coding constructs…


< quietly sashays away and lets Bailey deal with the aftermath of any fires she lit >

Vineetha Paruchuri
M.S., Computer Science
Dartmouth College


Author’s Note: Before all ye grammar pedants come out of the woodwork to get me, the “hence why maybe” thing was intentional. (BTW Bailey, I censored out my own “damn”, thank you. Now don’t censor this “damn”, or the one I just typed; ugh, this is turning so meta). So anyway, any other (grammar) mistakes that were overlooked are totally Bailey’s fault (he seems to take “The Editor” thing a tad too seriously; so go burn him for those if you must; bye now).



[1] “I checked it very thoroughly,” said the computer, “and that quite definitely is the answer. I think the problem, to be quite honest with you, is that you’ve never actually known what the question is.”

Tuesday, July 12, 2016

Quick PokemonGO Threat Modeling

Why I Caught Pokemon All Day Long Today

Most of y'all know by now I've got a four week old little man by my side 24/7, and it's the best thing ever. It also means that almost 100% of my time consists of: feeding, burping, changing, sleeping, or hacking. The precious free minutes I get are spent at the gym or running in the park, just to keep in some semblance of shape. This is why I was thrilled when PokemonGO came out. 

Aside from simply being fun as balls, it's a fun game I can play with my baby even though he has no idea what is going on. We can wander around the park, our neighborhood, or wherever, doing something. While a childless reader may think to themselves "why not just walk them around without Pokemon?", there is an important reason: boredom. 

Fun. As balls. 
While my baby boy is amazing and I love him far more than I could ever imagine loving a tiny creature that can't even really see me, there is only so much you can do with a baby this little. So, when you're walking around the park or the `hood, you can only explain trees, flowers, birds, and squirrels so many times. It gets repetitive. 

Enter Pokemon GO. 

Now, I can still tell him all kinds of things about nature and life, but we can do something else, too. It provides a nice contrast and an escape, while getting outside and walking around together. It introduces a cute randomness into an hour out of our day and that's great. 

This is why I was really annoyed with certain personalities from the information security industry getting all touchy on Twitter today about the latest in Poke-tech.

The Infosec Team Rocket

Pokemon GO had been getting a bad rep in the infosec scene for a few days even before today's Google permissions explosion. People's irrational (and sometimes rational) fears have been the major focus of the issues discussed on Twitter. I won't bother listing or debating them here, though some of them are a real part of a Pokemon GO threat model, and I'll discuss them later. 

Dammit.
But, the real rain on everyone's parade came today, when it was discovered that the Pokemon GO app requests permissions from a Google account that far exceed what it needs to function. And yes, that is an oversight and must absolutely be corrected. In fact, Niantic has already released a statement about their interpretation of the issue. The following sentence from the statement is, perhaps, the most important one:

Google has verified that no other information has been received or accessed by Pokémon GO or Niantic. 

This assertion by Niantic claims that Google is verifying that Niantic has not abused its access to Google users' accounts. And this was the primary reason why I wasn't concerned: Google isn't a stupid company. They have exceptional security engineers, many of whom we are friends with, either directly or indirectly. They also have a strict permissions model and monitoring subsystem for applications, to identify if there are (or were) abuses. If Google is backing Niantic's claim that no abuses have occurred, I believe them.

But, for a minute here, let's ignore what is likely to have happened, and look at what could have happened.

Playing What If

The reason why so many researchers were e-screaming their digital heads off today was because of the what if factor. What if an adversary compromised a phone with Pokemon GO loaded on it and captured the OAuth token that granted full access to the Google account? What if an adversary or insider was able to compromise the back-end database and usurp a massive cache of Uber-Mode security tokens? What if Psyduck came to life and actually entranced the Niantic and Google security teams and made off with all their juicy security tokens? What if?!?!

Madness and Indecision
First of all, this is why there are stringent security controls on iOS, Android, and any other modern mobile platform (including the Lab Mouse Security HarvestOS, which will be publicly discussed in the coming weeks). Platform security controls are supposed to disallow a malicious app from accessing security tokens for a separate app. In theory this model works. In practice, it sometimes works, but it depends on the platform. iOS is much better about application security and cross-application attacks. Android, however, is less successful at security in this area, but still does a sufficient job in most cases. 

If I were using an up-to-date Android firmware on a fairly modern device, I would feel mostly secure. Since I am using a modern iOS device with an up-to-date iOS image, I feel pretty damn secure about playing a damn video game. An attacker subverting valid and unaltered system controls on these platforms is unlikely because of the level of expertise required in accomplishing the task. If the hacker is targeting me personally because I am a known infosec personality, and they are sufficiently skilled, there is probably little I can do to thwart them. So it goes.

More importantly, the people upset about this part of the threat model forget one key fact: anyone skilled enough to bypass platform security controls at this level can do a lot worse than snatch an OAuth token for a stupid video game app. They can manipulate the entire phone, and usurp its functionality to control the Apple or Google account, anyway. So, by that token, who gives a shit? Your entire phone is owned anyway. The Google account is probably the least of your worries, not to mention that if you are using an Android phone, or an iOS with Google Apps installed on it, an OAuth token with high privileges probably already exists on your device. 

Break that Jail, Ash!
The other part to this argument, which is valid, is with respect to jailbreaking. For those unfamiliar, jailbreaking is the process of subverting a smart phone's security in order to run custom firmware. There are valid reasons to do this: freedom of choice, breaking out of regional restrictions, accessing unofficial apps, and subverting carrier controls. Yet, the user is knowingly subverting the platform's security model. That is what jailbreaking is. So, if the user is knowingly doing this, they are putting themselves at risk of decreased security by invalidating the controls used on the platform. Malware attacks, hacks, spyware, and other issues are all a valid concern for any user with a jailbroken phone. So, again, if the user is just worried about their damn Pokemon GO app, they are thinking incorrectly about their threat model. 

The final argument, which does have merit, is with regard to so-called "un-tethered" jailbreaks. An un-tethered jailbreak is a jailbreak that can occur without physical access to the phone. In other words, it can be used as an attack. A user that hasn't kept their phone up-to-date with the latest firmware image may be susceptible to an un-tethered jailbreak attack by visiting a malicious website, or some other means. This attack can render the device jailbroken, and may allow an adversary to remotely control the phone. Yet, again, this is an attack on the phone itself and not on Pokemon GO. So, these users are susceptible to far worse compromise than their stupid video game app.

Building a Legitimate Threat Model

So, now that we have a better understanding of Twitter's concerns, let's take a look at real concerns with Pokemon GO. Here is a brief and messy threat model:

Application Based:
  • Login Credentials (Google or Pokemon Server)
  • Application Secure Storage
  • Communication Security
API Based:
  • Credential Leakage
  • Metadata Extraction
  • Authentication and Authorization
  • Partners
Game Based:
  • Physical location tracking
  • Trainer to Human conversion
  • Baiting

Application Based Risks

In all honesty, this is the part of the model that I am least concerned about. Yet, this is the part that Twitter has been most vocal about.

Login Credentials

Regardless of which server type is used, critical data can be extracted. Since presumably more kids are associated with Pokemon server accounts, their personal information must be sufficiently guarded. It isn't OK to trivialize the Pokemon server tokens while exaggerating the Google ones. Both must be protected as both tokens potentially expose critical data.

However, to access the token, we know that:
  • Smartphone security must be subverted by an adversary
  • Back-end security must be subverted by an adversary
  • Communications security must be subverted by an adversary
As described above, smartphone security is a risk. Yet, actually exploiting the risk is difficult against the set of users that don't have out-of-date firmware or ancient phones. All users should update to the latest firmware images, which sharply decreases the risk of an attack.

While I cannot speak for Niantic's internal security, Google is now known to be assisting in monitoring the data extracted from the OAuth tokens. This will at least help identify abuses if they do occur, and will allow Google to assist in prevention. This decreases the potential for back-end or internal abuses. Users with long-term concerns can simply invalidate the OAuth token by logging into their Google account and revoking access to Pokemon GO in their security settings.

Communications security is another story. There are already reports of users abusing TLS in the Pokemon GO app because of a lack of certificate pinning. Niantic should review their TLS deployment to ensure there are no potential abuses that could result in an attack against communication between smartphones and the Niantic servers. But, for now, there is no known MITM attack, only attacks where the app is instrumented by a researcher. 


Application Secure Storage

If an app uses the underlying application security mechanisms provided by the platform, and does so in accordance with the platform's security guides, the application has done the best it can do. If the application has not adhered to these recommendations and guides, malicious applications may be able to subvert programmatic flaws in order to gain access to useful data. Yet, again, this would require direct access to the device by a malicious application or a physical user. 


Communications Security

Yes, this is important. Certificate pinning should be used. It, apparently, is not currently employed by the Niantic app. As of this writing, there are no reports of off-device MITM capability between the endpoint and the back-end servers.
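For readers who haven't seen certificate pinning before, here is a minimal sketch of the idea in Python. It is not how the Niantic app works or should work, just an illustration: the client ships with a known SHA-256 fingerprint and refuses to talk to a server whose certificate doesn't match. The function names and the sample bytes are hypothetical, and this sketch pins the whole certificate, whereas many real deployments pin the public key instead.

```python
import hashlib
import hmac
from typing import Iterable


def cert_fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pinned_fingerprints: Iterable[str]) -> bool:
    """True only if the presented certificate matches one of the pinned fingerprints."""
    presented = cert_fingerprint(der_cert)
    # compare_digest avoids leaking how many leading characters matched.
    return any(hmac.compare_digest(presented, pin) for pin in pinned_fingerprints)


# Hypothetical stand-ins for real DER certificate bytes.
expected_cert = b"pretend this is the real server certificate"
attacker_cert = b"pretend this is a MITM proxy certificate"

pins = [cert_fingerprint(expected_cert)]
print(is_pinned(expected_cert, pins))
print(is_pinned(attacker_cert, pins))
```

In a real client, the DER bytes would come from the TLS connection itself, for example from `ssl.SSLSocket.getpeercert(binary_form=True)` after the handshake, and the connection would be dropped when the check fails.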


API Security

This is the second most important part of the threat model. The state of the Niantic API is unknown. Apart from the missing certificate pinning, there are no reports of security failures in the TLS implementation, and there is little information about the actual protocol or how the API validates requests issued against it.

Since there is no information about the Niantic API, and I am not going to go pokemoning around their services to determine if they are weak (as I don't have permission), this part of the threat model is currently marked <acceptable risk>.

This is a very important concept, "acceptable risk". This is the idea that you know there may be a security risk, but you choose to accept it and move on with your life. The fact is, any of the services we use could, at any time, be at risk of manipulation by an adversary. And, the fact is, most of them are at risk many times throughout our lives without us knowing about it. Flaws happen, and they are going to happen. That's just how the world works. Until that changes, unknowns that we cannot compensate for leave us with one of two choices: accept the risk or reject the product or service. In this case, I choose you, Pikachu.


Physical Security

This is the most important aspect of the Pokemon GO threat model, and, perhaps, not for the reason the reader might presume. 

The security of the physical endpoint is masked by the controls of the platform and the controls of the application. Since Pokemon GO has no physical component other than the smartphone itself, there is no specific hardware identifier that is associated with a Pokemon Trainer in the game. Rather, the application identities are associated with metadata that may have a relation to a physical device. That is all a part of the unknown API. Yet, those translations are all a part of the hidden abstractions that are presumably secured via TLS and Niantic's back-end services security model. We shouldn't have to worry about them, and if we do have to worry about them, the entire security model is broken. 

There is a much more fascinating angle, however: Bluetooth. 

It has been reported that Pokemon GO will include a Bluetooth wearable device that synchronizes with the game. This device would light up and/or buzz when a Pokemon is nearby. This allows the user to walk around without having to constantly stare at their phone. Instead, the user can interact with the wearable device, or choose to look at the smartphone only when an event has occurred.

Yet, the device would use Bluetooth Low Energy (BLE) to communicate with the phone. BLE's range is surprisingly long, as I recently found out in designing and manufacturing Lab Mouse Security's custom BLE module. In addition, the BLE wearable must be able to synchronize with the endpoint. If the wearable isn't paired because the phone is off or sleeping, it may emit a beacon identifying what it is. This, and the BLE radio address (MAC), may allow an adversary to physically track the device in the real world. This would allow an adversary to sit at a Pokemon Gym intercepting BLE addresses, then tracking those addresses back to physical locations (homes). 
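To show why a fixed radio address is a problem, here is a rough sketch in Python of the correlation step. It assumes an observer has already logged advertisements as (timestamp, address, place) tuples; the addresses, place names, and function names are all made up for illustration. No radio work is involved, only bookkeeping, which is exactly the point.

```python
from collections import defaultdict


def group_sightings(sightings):
    """Map each BLE address to the time-ordered list of (timestamp, place) sightings."""
    seen = defaultdict(list)
    for timestamp, address, place in sorted(sightings):
        seen[address.lower()].append((timestamp, place))
    return dict(seen)


def addresses_seen_at_both(sightings, place_a, place_b):
    """Return addresses that were observed at both places."""
    by_address = group_sightings(sightings)
    return sorted(
        address
        for address, visits in by_address.items()
        if {place_a, place_b} <= {place for _, place in visits}
    )


# Hypothetical log: (unix timestamp, BLE address, where the sniffer was).
log = [
    (1000, "C4:11:22:33:44:55", "gym"),
    (1005, "D0:AA:BB:CC:DD:EE", "gym"),
    (5000, "c4:11:22:33:44:55", "elm street"),
]

print(addresses_seen_at_both(log, "gym", "elm street"))
```

If the wearable's address never changes, two sightings are enough to tie a Gym visit to a street.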

This is a legitimate concern not just for me, but for the entire Bluetooth SIG. To combat these problems, the SIG has made enhancements to the BLE spec. One of these enhancements is generating a random MAC so that a device cannot be tracked. A unique MAC can be used to communicate with other devices, ensuring that the device cannot be linked to a specific user. However, this MAC isn't autogenerated and renegotiated at every session, only sometimes. Plus, the radio firmware must support this feature. As a result, if a phone and BLE wearable are always on and paired, they will likely always use the same random MAC to communicate, negating the benefit of this feature. The feature is imperative and must be a part of the BLE spec, but its implementation must be heavily vetted to ensure that it cannot accidentally be turned into a trackable beacon. 
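As a quick aid when auditing a device, the kind of random address it uses can be read off the address itself. In the BLE spec, the two most significant bits of a random device address indicate its sub-type: 11 for static, 01 for resolvable private, and 00 for non-resolvable private. The sketch below, in Python, assumes the address is already known to be a random address (public versus random is signaled separately in the advertising packet, not in the address) and that it is written most significant octet first. The function name is hypothetical.

```python
def random_address_type(address: str) -> str:
    """Classify a BLE *random* device address by its two most significant bits."""
    first_octet = int(address.split(":")[0], 16)
    top_bits = first_octet >> 6
    kinds = {
        0b11: "static random",
        0b01: "resolvable private",
        0b00: "non-resolvable private",
    }
    # 0b10 is not assigned to any random address sub-type.
    return kinds.get(top_bits, "reserved")


for addr in ("C4:11:22:33:44:55", "4A:11:22:33:44:55", "3F:11:22:33:44:55"):
    print(addr, "->", random_address_type(addr))
```

A wearable that always advertises with a static random address is trackable for as long as that address stays the same. A resolvable private address only helps if the firmware actually rotates it.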

Beacons, yo.
Another issue is the security token or public key used to negotiate a secure session. The Bluetooth SIG has added secure session negotiation using ECDH, which, again, is excellent. However, this only secures the negotiation of the session layer and doesn't actually reduce the threat of attack in the event that the core keys used are either guessable or static across a deployment of devices. If the key provisioning or personalization steps are flawed, the resultant set of BLE wearable devices will have flawed communication that can be intercepted and decrypted by anyone with the ability to attack a single physical device. Then, a simple radio trick can be used to perform a MITM attack against BLE communication. It is notable, however, that this attack requires a high level of expertise, time, and equipment, and is unlikely to occur. 

And yet, if the same security token is always emitted by a BLE wearable in order to negotiate a connection with a smartphone or peer device, a simple and cheap ($20) BLE sniffer may be able to track the wearable device. Or, at least, map the device to its current MAC, then allow the adversary to track the MAC. 
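Here is a small sketch, in Python, of why a static token undoes address randomization. It assumes a sniffer has produced (address, token) pairs from captured advertisements. The token bytes, addresses, and function name are invented for the example, and nothing here reflects the real wearable's packet format, which isn't public as far as I know.

```python
from collections import defaultdict


def link_addresses_by_token(advertisements):
    """Group addresses that emitted the same token; keep tokens seen under 2+ addresses."""
    linked = defaultdict(set)
    for address, token in advertisements:
        linked[token].add(address.lower())
    return {
        token: sorted(addresses)
        for token, addresses in linked.items()
        if len(addresses) > 1
    }


# Hypothetical capture: the device rotated its MAC but kept the same token.
captured = [
    ("4a:00:00:00:00:01", b"\x01\x02\x03"),
    ("5b:00:00:00:00:02", b"\x01\x02\x03"),
    ("6c:00:00:00:00:03", b"\xff\xee\xdd"),
]

print(link_addresses_by_token(captured))
```

Any value that stays constant across sessions works as the join key here, which is why the audit has to cover everything the device emits and not just the MAC.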


Wrapping it Up

Thus, when the wearable device does come out, I hope the Pokemon GO team at Niantic heavily audits the BLE device to ensure that security is properly implemented. This way, our super fun and awesome video game won't turn into a very simple tool for tracking kids back to their homes. But, with Google involved, I'm sure a plan is already in place. 

But, if one isn't, Lab Mouse Security wrote the GSMA IoT Security Guides, based on our DARPA grant, and is ready to assist with the security review.

But, for now, here is a set of recommendations for Pokemon GO users:
  • Generate a Google account just for Pokemon GO, don't use a work or personal account
  • Don't use the Pokemon GO app at your home, only in public places, to reduce the potential for stalking
  • Don't use a jailbroken phone
  • Keep your phone's firmware updated
  • Never go to isolated places alone
  • Never go to parks, poorly lit areas, or isolated locations at night
  • Always Pokemon GO with a buddy
  • Hide your Pokemon Trainer name when taking screenshots by using an image editing app, such as Photoshop
Keep catching them all! 

Best,
Don A. Bailey
CEO / Founder
Lab Mouse Security