Archive
How not to protect your app
So, I went to Droidcon this week.
And to be honest, it disappointed me in almost every respect: from content to catering.
I don’t go to many conventions, but compared to August Penguin, where a ticket costs about one tenth as much, Droidcon was surprisingly low quality.
The agenda had not one but two presentations on how to “protect your app from hackers”, and unfortunately, both could be summed up in one word: obfuscate.
Obfuscation is the worst way to secure any software, and Android applications are no different in this regard. If anything, since smartphones today often contain more sensitive and valuable user data than PCs, it is even more important to use real security in mobile apps!
Obfuscation is bad because:
- It is difficult and costly to implement but relatively easy to break.
- It prevents, or at least significantly reduces your ability to debug your app and resolve user reported issues.
- It only needs to be broken once to be broken everywhere, always, for any user.
- Fixing it once broken is almost impossible.
Basically, obfuscation is what security experts call “security by obscurity” which is considered very insecure.
Consider this: the entire Internet, including most banking and financial sites, runs almost completely on open source software and on standard, open, and documented protocols: Apache, NGINX, OpenSSL, Firefox, Chrome.
OpenSSL is particularly interesting, because it provides encryption that is good enough for the most sensitive information on the net, yet it is completely open.
Even the infamous “Dark Web” or more specifically the Tor network, is completely open source.
Truly secure software design does not rely on others not understanding what your code does.
Even malware writers, whose entire bread and butter depends on hiding what their apps are doing, no longer rely on obfuscation but have instead moved on to full blown code encryption and delivering code on demand from a server.
This is because obfuscated code was too easy for security researchers, and even fully automated malware scanners, to detect.
If you want to learn more about this, follow security company blogs such as Symantec, Kaspersky and Check Point. Their researchers sometimes publish very interesting malware analysis showing the tricks malware writers use to hide their evil code.
Specifically bad advice
Now I would like to go over some specific advice given in one presentation, that I consider to be particularly bad, some of it to the point where I would call it “anti-advice”:
- Use reflection
- Hide things in “native”
- Hide data with protobufs or similar
- Hide code with ProGuard or similar
I listed these “recommendations” from most to least harmful.
If you are considering using any of these techniques in your app, please read the following explanation before doing so, and reconsider.
1. Using reflection
Reflection is a powerful tool, but it was not designed to hide code.
If you use reflection to call a method in a class, anyone looking at a disassembled version of your app, or even running the simple “strings” utility on it, will still see all the method and class names.
You will gain nothing, but lose all the protection from bugs and crashes that compiler checks and lint tools normally provide.
You could go as far as scrambling (or even fully encrypting) the strings holding the names of the classes and methods you call with reflection, and only deciphering them at run time.
But this takes a long time to implement, is very error prone, and will make your app slower, since it has a lot more work to do for a single, simple function call.
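To make this concrete, here is a minimal sketch of a reflective call. The class and method names (ReflectionDemo, getGreeting) are made up for illustration. The method name is an ordinary string constant in the compiled class, and a simple typo in it is only discovered at run time.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class ReflectionDemo {
    // Hypothetical method we pretend to "hide" behind reflection.
    public static String getGreeting() {
        return "hello";
    }

    // Calls a public static no-argument method of this class by name.
    // The name is a plain string constant in the compiled class file,
    // so it is just as visible to "strings" as a direct call would be.
    static String callByName(String methodName) {
        try {
            Method m = ReflectionDemo.class.getMethod(methodName);
            return (String) m.invoke(null);
        } catch (NoSuchMethodException e) {
            // A typo the compiler would have caught is now a run time failure.
            return "no such method: " + methodName;
        } catch (IllegalAccessException | InvocationTargetException e) {
            return "call failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("getGreeting"));
        // Misspelled on purpose: compiles fine, fails only when executed.
        System.out.println(callByName("getGreting"));
    }
}
```

A direct call to getGreting() would have been rejected by the compiler. The reflective version ships to users and fails there.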
Stop and think carefully: what would a “hacker” analyzing your app gain from knowing what API you are calling?
The answer will always be: nothing, unless your app is very badly written!
Most security apps, like password managers, advertise exactly what encryption they are using so their customers know how secure they are.
In a truly secure app, knowing what the app does will not help a bad guy break it in any way.
I challenge anyone to give me an example in the comments of an API call that is worth hiding in a legitimate app.
2. Hide things in “native”
If you are not familiar with JNI and writing C and C++ code for Android, go read up on it, but not in order to protect your app!
Because Java (and Kotlin) still have performance issues with certain types of tasks (specifically, games and graphics related code), developers at Google created the Native Development Kit (NDK) to let you include native C and C++ code in your app that runs directly on the device processor rather than in the Android runtime (Dalvik / ART).
But just as with reflection, the NDK was not designed to hide your code from prying eyes.
And thus, it will not hide anything!
To many Java developers, particularly ones with no experience or knowledge of developing in native (to the hardware) languages like C, an .so binary file will look like complete gibberish even in a hex editor.
But, just because you don’t understand what you are looking at, does not mean someone trying to crack your app will have a hard time understanding it!
An .so file is a standard shared library format that has been used on Linux systems for decades (remember: Android runs on top of the Linux kernel), so there are lots of tools out there to disassemble, decompile, and analyze these files.
But your attacker probably won’t even need to decompile your .so file. If all they are looking for is some string, like a hard-coded password or API token you put in your app, they can still see it with a simple “strings” utility, same as they would in your Java or Kotlin code. There is no magic here – string literals remain intact when compiling to native.
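What “strings” does is not sophisticated. The sketch below is a rough imitation of it, not the real utility: it scans a byte array for runs of printable ASCII. The byte array is a made-up stand-in for a compiled binary, and the token inside it is invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class StringsDemo {
    // Returns every run of at least minLen printable ASCII bytes,
    // which is roughly what the "strings" utility reports.
    static List<String> printableRuns(byte[] data, int minLen) {
        List<String> found = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        for (byte b : data) {
            if (b >= 0x20 && b <= 0x7e) {
                run.append((char) b);
            } else {
                if (run.length() >= minLen) found.add(run.toString());
                run.setLength(0);
            }
        }
        if (run.length() >= minLen) found.add(run.toString());
        return found;
    }

    // Stand-in for a native binary: a hard-coded (invented) secret
    // surrounded by a few non-printable bytes.
    static byte[] fakeBinary() {
        byte[] secret = "API_TOKEN=s3cr3t".getBytes(StandardCharsets.US_ASCII);
        byte[] blob = new byte[secret.length + 6];
        blob[0] = 0x01;
        blob[1] = 0x02;
        // blob[2] and the last three bytes stay 0x00
        System.arraycopy(secret, 0, blob, 3, secret.length);
        return blob;
    }

    public static void main(String[] args) {
        System.out.println(printableRuns(fakeBinary(), 4));
    }
}
```

The scan never needs to understand machine code. It only needs the string to be stored as-is, which is exactly what a compiler does with a literal.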
Also, any external functions or methods your native code calls will appear in the binary file as plain text strings – your OS needs to find them to call them, so they will be there, exposed just like in any other language!
But things can get worse: let’s say you have some valuable business logic in your app, and you want to make it harder for hackers to decompile your code and see this logic.
It is true that when you compile Java source, a lot more information about the original code is preserved than when compiling C or C++.
But don’t be tempted by this, because you may leave yourself even more exposed!
If a hacker really wants to run your code for his (or her) nefarious purposes, then wrapping it up in a library that can easily be called from any app is like gift wrapping it for them.
And this is exactly what you will be doing if you put your code in a native .so file: you are putting it in an easy to use library!
Instead of decompiling your code and rewriting it in their app, the hacker can just take your .so file, and call its functions (that must be exposed to work!) from their own app.
They don’t need to know what your code does, they can just feed it parameters and get the result, which is what they really wanted in the first place.
So instead of hampering hacking, you make it easier by using the wrong tool, all the while giving yourself a false sense of security.
3. Hide data with protobufs or similar
By now you should have noticed a pattern: one thing all this bad advice has in common is recommending the wrong tool for the job.
Protobufs is an excellent open source tool for data serialization.
It is not a security tool!
The actual advice given in the presentation was to replace JSON in server responses with protobufs in order to make the information sent by the server less readable.
But what security do you gain from this? If your server sends a reply like this:
{ "first_name" : "Jhon", "last_name" : "Smith", "phone" : "555-12345", "email" : "jhon@email.com" }
converting this structure to a protobuf will look something like this:
xxxxJhonxxxxSmithxxxx555-12345xxxjhon@email.com
Is that really hiding anything?
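You can check this without installing anything. The sketch below hand-encodes the same four values as protobuf string fields instead of using the real protobuf library. The field numbers 1 to 4 are an assumption, since the presentation did not show a schema. On the wire, each string field is a key byte, a length, and then the raw UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ProtoWireDemo {
    // Appends one string field in protobuf wire format:
    // key = (fieldNumber << 3) | 2 (length-delimited), then length, then UTF-8 bytes.
    // Simplified: assumes field numbers below 16 and values shorter than 128 bytes,
    // so that the key and the length each fit in a single byte.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.write((fieldNumber << 3) | 2);
        out.write(bytes.length);
        out.write(bytes, 0, bytes.length);
    }

    // Field numbers are hypothetical; the values are the ones from the JSON sample.
    static byte[] encodeContact() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, "Jhon");
        writeString(out, 2, "Smith");
        writeString(out, 3, "555-12345");
        writeString(out, 4, "jhon@email.com");
        return out.toByteArray();
    }

    // Shows the message the way a hex editor's text column would:
    // printable bytes as-is, everything else as a dot.
    static String readable(byte[] message) {
        String raw = new String(message, StandardCharsets.ISO_8859_1);
        return raw.replaceAll("[^\\x20-\\x7e]", ".");
    }

    public static void main(String[] args) {
        System.out.println(readable(encodeContact()));
    }
}
```

Every value survives as plain text, with a couple of bytes of framing in between. Anyone sniffing unencrypted traffic reads it as easily as the JSON.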
Protobufs are more compact than JSON, and they can be deserialized faster and more easily than JSON, but they also have some disadvantages: they are not as flexible as JSON.
Optional fields need more care with protobufs, and dynamic or self describing objects are even harder to create.
If your app needs flexibility in parsing server replies, or if you have other clients, particularly web clients written with JS that access the same server API, JSON may be the better choice for you.
When deciding whether to use JSON or protobufs, consider their advantages and disadvantages for your use case, DO NOT CONSIDER SECURITY!
They are both equally insecure, and you will need encryption (always use SSL/TLS!) and proper access validation (passwords, tokens, client certificates) if you want to keep your data safe.
4. Hide code with ProGuard or similar
This advice actually talks about the right tool for a change: ProGuard.
This is a tool Google ships with Android Studio, and it does two things: reduces the size of your code and resources after compilation, and slightly obfuscates your code.
This is not a bad tool, but it comes with a cost, and it won’t really give you protection from hackers.
It will rename a method like getMySecretPassword() to a(), but will that really stop anyone from doing anything bad?
At best, it will slow them down, but keep in mind that it will also cost you:
ProGuard has the side effect of rendering all stack traces unreadable and making debugging the app extremely difficult.
There is a way to mitigate this: you need to keep the mapping file (mapping.txt) that ProGuard generates for every single build of your app, because the obfuscated names can differ from one build to the next.
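To give an idea of what that file is used for, here is a simplified sketch that reads only the class lines of a mapping file and builds a lookup table from obfuscated names back to the original ones. The sample mapping and its class names are invented. In practice you would use the retrace tool that ships with ProGuard, which also handles members and line numbers.

```java
import java.util.HashMap;
import java.util.Map;

class MappingDemo {
    // Invented sample in the mapping file layout: class lines look like
    // "original.Name -> obfuscated.name:", member lines are indented.
    static final String SAMPLE =
            "com.example.app.LoginManager -> a.a:\n"
            + "    java.lang.String getMySecretPassword() -> a\n"
            + "com.example.app.MainActivity -> com.example.app.MainActivity:\n";

    // Builds an obfuscated-name -> original-name table from the class lines.
    // Member lines and comment lines are ignored in this simplified sketch.
    static Map<String, String> classNames(String mapping) {
        Map<String, String> table = new HashMap<>();
        for (String line : mapping.split("\n")) {
            if (line.startsWith(" ") || line.startsWith("#") || !line.endsWith(":")) {
                continue;
            }
            String[] parts = line.substring(0, line.length() - 1).split(" -> ");
            if (parts.length == 2) {
                table.put(parts[1], parts[0]);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> table = classNames(SAMPLE);
        // A stack trace frame that mentions "a.a" can now be translated back.
        System.out.println("a.a is really " + table.get("a.a"));
    }
}
```

If you lose the mapping file for a released build, a crash report that mentions a.a.a() tells you nothing.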
If you need to support users in production and don’t want to be helpless or work extra hard when they report a crash, you might want to give up on ProGuard.
Also keep in mind that you need to carefully tell ProGuard what not to obfuscate, since you must keep any external API calls, components declared in the manifest and some third party library calls intact, or your app will not run.
Remember – ProGuard will:
- Not keep any hardcoded strings safe.
- Not keep your user password safe if you store it as plain text in your app data folder.
- Not keep your communication safe if you do not use SSL.
- Not protect you from MITM attacks if you do not use certificate pinning.
ProGuard might make your final APK file smaller by getting rid of unused code and reducing the length of class and method names, but you should carefully consider the price you will pay for this reduction when dealing with bugs and crashes.
I find it is usually just not worth the hassle.
And there are better tools now for reducing download size, such as App Bundles.
Summary
Messing with your code will never make your app more secure. It will not protect you from hackers.
Even if you do not want to, or cannot, release your app as open source, you still need to remember that trying to hide its code with obfuscation will cost you more than having your app reverse engineered.
The development, debugging, and user support costs can be as devastating as any hacks!
But, if you treat your code as though it is meant to be open, and make sure that even if a bad person can read and understand everything your app does they still cannot get your users’ data or exploit your web server, then, and only then, will your app be truly secure.
And doing that is often easier and cheaper than trying to obfuscate your code or data.
P. S.
One of the presentations mentioned a phenomenon I was not familiar with: “App cloning”.
Apparently, if you publish an ad supported app, some bad people can take your app without your permission, replace your advertisement API keys with their own, and release the app to some unofficial app stores like the ones that are common in China (because Google Play is blocked there by the government).
This way, they will get ad revenue from your app instead of you.
But consider this: would you publish your app to these stores?
If your answer is “no”, then you are not losing anything!
You will never get any money from these users because they will never be able to install your original app, so any effort you put in defending against “cloning” will be a net financial loss to you.
Remember – as a developer, your time is money!
P. S. 2
Someone in the audience asked about Google API keys like the Google Maps API key.
Usually, it is bad practice to put API keys in plain text in the manifest of your app, because anyone can get them from there and use a paid API at your expense.
But this is not the case with Google API keys!
Google tells you to put the key in the manifest because it designed these API keys so that they are useless to anyone but you, so stealing them is pointless.
This is a great example of good security design: instead of relying on app developers to figure out how to distribute an API key to millions of users but keep it safe from hackers, Google tied the key to your signing certificate and your app id (package name).
When you create the API key, you must enter your certificate fingerprint and your package name.
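A certificate fingerprint is nothing more than a hash of the encoded certificate, usually shown as colon-separated hex. The sketch below assumes SHA-1 and uses the bytes of the text "abc" as a stand-in for a real certificate, so the method name and input are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class FingerprintDemo {
    // Formats the SHA-1 digest of the given bytes as colon-separated
    // upper-case hex, the form in which fingerprints are usually displayed.
    static String sha1Fingerprint(byte[] encodedCert) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(encodedCert);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                if (sb.length() > 0) sb.append(':');
                sb.append(String.format("%02X", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in input: a real fingerprint is computed over the
        // DER-encoded signing certificate, not over a short text.
        byte[] standIn = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sha1Fingerprint(standIn));
    }
}
```

The fingerprint itself is public information. What stays secret is the private key that matches the certificate.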
Your private key, the one you use to sign your apps for release, is something most developers already keep very safe. There is never a reason to send it anywhere and it would never be included in the app itself.
It will stay safe on the developer’s computer.
And without this private key, the public API key will not work.
If it is used in an app signed by anyone else, even if that app fakes your app’s id, the API key will still be invalid.
This is how you secure apps!
Sneaking features through the back door
Sometimes programming language developers decide that certain practices are bad, so bad that they try to prevent their use through the language they develop.
For example: In both Java and C# multiple inheritance is not allowed. The language standard prohibits it, so trying to specify more than one base class will result in a compiler error.
Another blocking “feature” these languages share is a syntax preventing creation of derived classes altogether.
For Java, it is declaring a class to be final, which might be a bit confusing for new users, since the same keyword is used to declare constants.
As an example, this will not compile:
public final class YouCanNotExtendMe { ... }
public class TryingAnyway extends YouCanNotExtendMe { ... }
For C# just replace final with sealed.
This can also be applied to specific methods instead of the entire class, to prevent overriding, in both languages.
While application developers may not find many uses for this feature, it shows up even in the Java standard library. Just try extending the built-in String class.
But, language features are tools in a tool box.
Each one can be both useful and misused or abused. It depends solely on the developer using the tools.
And that is why as languages evolve over the years, on some occasions their developers give up fighting the users and add some things they left out at the beginning.
Usually in a sneaky, roundabout way, to avoid admitting they were wrong or that they succumbed to peer pressure.
In this post, I will show two examples of such features, one from Java, and one from C#.
C# Extension methods
In version 3.0 of C# a new feature was added to the language: “extension methods”.
Just as their name suggests, they can be used to extend any class, including a sealed class. And you do not need access to the class implementation to use them. Just write your own static class with a static method (or as many methods as you want) whose first parameter is marked with the keyword this and has the type you want to extend.
Microsoft’s own guide gives an example of adding a method to the built in sealed String type.
Those who know and use C# will probably argue that there are two key differences between extension methods and derived classes:
- Extension methods do not create a new type.
Personally, I think that will only affect compile time checks, which can be replaced with run time checks if not all instances of the ‘base’ class can be extended.
Also, a creative workaround may be possible with attributes.
- Existing methods can not be overridden by extension methods.
This is a major drawback, and I can not think of a workaround for it.
But, you can still overload methods. And who knows what will be added in the future…
So it may not be complete, but a way to break class seals was added to the language after only two major iterations.
Multiple inheritance in Java through interfaces
Java has two separate mechanisms to support polymorphism: inheritance and interfaces.
A Java class can have only one base class it inherits from, but can implement many interfaces, and so can be referenced through these interface types.
public interface IfaceA { void methodA(); }
public interface IfaceB { void methodB(); }

public class Example implements IfaceA, IfaceB {
    @Override
    public void methodA() { ... }
    @Override
    public void methodB() { ... }
}

Example var0 = new Example();
IfaceA var1 = var0;
IfaceB var2 = var0;
But, before Java 8, interfaces could not contain any code, only constants and method declarations, so classes could not inherit functionality from them, as they could by extending a base class.
Thus while interfaces provided the polymorphic part of multiple inheritance, they lacked the functionality reuse part.
In Java 8 all that changed with the addition of default and static methods to interfaces.
Now, an interface could contain code, and any class implementing it would inherit this functionality.
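Here is a minimal sketch of what that looks like. The names (Walker, Swimmer, Duck) are made up for the example. Duck contains no code of its own, yet it gets working methods from two different sources.

```java
class DefaultMethodsDemo {
    // Two unrelated interfaces, each carrying real code (Java 8 or later).
    interface Walker {
        default String walk() { return "walking"; }
    }

    interface Swimmer {
        default String swim() { return "swimming"; }
    }

    // Inherits behavior from both interfaces without writing any of its own.
    static class Duck implements Walker, Swimmer { }

    static String describe() {
        Duck duck = new Duck();
        // The same object can still be referenced through either interface type.
        Walker w = duck;
        Swimmer s = duck;
        return w.walk() + " and " + s.swim();
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "walking and swimming"
    }
}
```

If both interfaces declared a default method with the same signature, the compiler would force Duck to override it and pick one, so the classic ambiguity of multiple inheritance is resolved by hand.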
It appears that Java 9 is about to take this one step further: it will add private methods to interfaces!
Before this, everything in an interface had to be public.
This erases most of the differences between interfaces and abstract classes (interfaces still cannot hold instance fields), and allows multiple inheritance. But, being a back door feature, it still has some limitations compared to the true multiple inheritance that is available in languages like Python and C++:
- You can not inherit from an arbitrary collection of classes. The class writer must allow joint inheritance by implementing the class as an interface.
- Unlike regular base classes, interfaces can not be instantiated on their own, even if all the methods of an interface have default implementations.
This can be easily worked around by creating a dummy class without any code that implements the interface.
- There are no protected methods.
Maybe Java 10 will add them…
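The instantiation workaround does not even need a named dummy class. In the sketch below (Greeter is an invented name) an empty anonymous class does the job. The example also uses a private interface method, so it assumes Java 9 or later.

```java
class InterfaceInstanceDemo {
    interface Greeter {
        // Public behavior with a default implementation.
        default String greet(String name) {
            return prefix() + name;
        }

        // Private helper, only callable from inside the interface (Java 9+).
        private String prefix() {
            return "Hello, ";
        }
    }

    static String demo() {
        // "new Greeter()" alone does not compile; the empty braces
        // create an anonymous class that adds no code at all.
        Greeter g = new Greeter() { };
        return g.greet("world");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Hello, world"
    }
}
```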
But basically, after 8 major iterations of the language, you can finally have full blown multiple inheritance in Java.
Conclusion
These features have their official excuses:
Extension methods are supposed to be “syntactic sugar” for “helper” and utility classes.
Default method implementation is supposed to allow extending interfaces without breaking legacy code.
But whatever the original intentions and reasoning were, the fact remains: you can have C# code that calls instance methods on objects even though those methods are not part of the original class, and you can now have Java classes that inherit “is a” type and working code from multiple sources.
And I don’t think this is a bad thing.
As long as programmers use these tools correctly, it will make code better.
Fighting your users is always a bad idea, more so if your users are developers themselves.
Do you know of any other features like this that showed up in other languages?
Let me know in the comments or by email!
Beware Java’s half baked generics
Usually I don’t badmouth Java. I think it’s a very good programming language.
In fact, I tend to defend it in arguments on various forums.
Sure, it lacks features compared to some other languages, but then again throwing everything including the kitchen sink into a language is not necessarily a good idea. Just look at how easy it is to get a horrible mess of code in C++ with a single operator doing different things depending on context. Is &some_var trying to get the address of a variable or a reference? And what does &&some_var do? It has nothing to do with the boolean AND operator!
So here we have a simple language friendly to new developers, which is good because there are lots of those using it on the popular Android platform.
Unfortunately, even the best languages have some implementation detail that will make you want to lynch their creators or just rip out your hair, depending on whether you externalize your violent tendencies or not.
Here is a short code example that demonstrates a bug that for about 5 minutes made me think I was high on something:
HashMap<Integer, String> map = new HashMap<>();
byte a = 42;
int b = a;
map.put(b, "The answer!");
if (map.containsKey(a))
    System.out.println("The answer is: " + map.get(a));
else
    System.out.println("What was the question?");
What do you expect this code to print?
Will it even compile?
Apparently it will, but the result will surprise anyone who is not very familiar with Java’s generic types.
Yes folks – the key will not be found and the message What was the question? will be printed.
Here is why:
The generic types in Java are not fully parameterized. Unlike a proper C++ template, some methods of generic containers take parameters of type Object, instead of the type the container instantiation was defined with.
For HashMap, even though its put method is properly parameterized and will raise a compiler error if the wrong type of key is used, the get and containsKey methods take a parameter of type Object and will not even throw a runtime exception if the wrong type is provided. They will simply return null or false respectively, as if the key was simply not there.
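A short sketch of this asymmetry, with invented names: inserting with the wrong key type is rejected by the compiler, while lookups with the wrong key type compile and quietly find nothing.

```java
import java.util.HashMap;
import java.util.Map;

class ObjectKeyDemo {
    static Map<Integer, String> sample() {
        Map<Integer, String> map = new HashMap<>();
        map.put(42, "The answer!");
        // map.put("42", "nope"); // would not compile: put() requires an Integer key
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = sample();
        // Both lookups compile, because get() and containsKey() accept any Object.
        System.out.println(map.get("42"));        // String key: prints null
        System.out.println(map.containsKey(42L)); // Long key: prints false
        System.out.println(map.get(42));          // Integer key: prints The answer!
    }
}
```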
The other part of the problem is that primitive types such as byte and int are second class citizens in Java. They are not objects like everything else and can not be used to parameterize generics.
They do have object equivalents named Byte and Integer but those don’t have proper operator overloading so are not convenient for all use cases.
Thus in the code sample above the variable a gets autoboxed to Byte, which as far as Java is concerned is a completely different type that has nothing to do with Integer, and therefore there is no way to search for Byte keys in an Integer map.
A language that implements proper generics would have parameterized these methods so either a compilation error occurred or an implicit cast was made.
In Java, it is up to you as a programmer to keep your key types straight, even between seemingly compatible types like integers of various sizes.
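One way to keep the types straight is to widen the byte to int explicitly before it reaches the map, so that autoboxing produces an Integer. The sketch below shows the failing lookup next to the fixed one. The helper name lookup is made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class ByteKeyDemo {
    static Map<Integer, String> sample() {
        Map<Integer, String> map = new HashMap<>();
        map.put(42, "The answer!");
        return map;
    }

    // Widen to int first, so autoboxing produces an Integer rather than a Byte.
    static String lookup(Map<Integer, String> map, byte key) {
        return map.get((int) key);
    }

    public static void main(String[] args) {
        Map<Integer, String> map = sample();
        byte a = 42;
        System.out.println(map.containsKey(a));       // boxed to Byte: prints false
        System.out.println(map.containsKey((int) a)); // boxed to Integer: prints true
        System.out.println(lookup(map, a));           // prints The answer!
    }
}
```

Keep in mind that byte is signed in Java, so values above 0x7f widen to negative numbers. When the bytes of a binary protocol are meant to be unsigned, key & 0xFF is the usual way to widen them.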
In my case I was working with a binary protocol received from an external device, and the function filling up the map was not the same one reading from it, so it was not straightforward to align types everywhere. But in the end I did it and learned my lesson.
Maybe this long rant will help you too. At least until a version of Java gets this part right…