Obfuscation: String Encryption

by Jason Haley 16. June 2006 00:27

Last time I discussed renaming; this time I want to discuss string encryption. String encryption is the process of encrypting the quoted strings that appear in your code. These strings are stored in the #US (user string) heap of an assembly, which holds the string literals referenced from the IL (in other words, the literals you write in your methods, properties, etc.). This means that when you disassemble or decompile an assembly, you will see these strings in plain text. For example, if I want to get around some part of your system that gives me a message like "Invalid Password, Please try again", the first thing I would do is disassemble the assembly to an IL file and search for that text. Once I find that message I can start to figure out how to jump over it ... I'm sure you get the point. String encryption is an obfuscation technique that encrypts these strings so they are no longer simple to read.
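
To make the "search for that text" step concrete, here is a minimal sketch of why plain-text literals are so easy to find. It assumes only that string literals sit in the file as UTF-16 little-endian bytes, which is what the heap dumps later in this post show. The names LiteralSearch, IndexOf and ContainsLiteral are made up for illustration.

```csharp
using System;
using System.Text;

public static class LiteralSearch
{
    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        if (needle.Length == 0) return 0;
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j]) j++;
            if (j == needle.Length) return i;
        }
        return -1;
    }

    // True if the raw bytes contain the literal encoded as UTF-16LE.
    public static bool ContainsLiteral(byte[] assemblyBytes, string literal)
    {
        return IndexOf(assemblyBytes, Encoding.Unicode.GetBytes(literal)) >= 0;
    }
}
```

Passing the bytes of an unobfuscated assembly and the text of an error message to ContainsLiteral is the byte-level equivalent of searching the IL file. Once the literal is encrypted, the plain bytes are no longer there to match.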

For example, if you take some simple C# code like this:

    public class TestCustomer
    {
        public Customer CreateDefaultCustomer()
        {
            Customer c = new Customer();

            c.FirstName = "Jason";
            c.LastName = "Haley";
            c.Company = string.Empty;

            return c;
        }
    }

Compile the code and open the assembly in Reflector, and you will see the decompiled method.

In this case Reflector does a great job of showing you a C# representation of my original C#. If you want to see what the #US heap looks like, open the assembly in ILDasm, select View -> MetaInfo -> Raw:Heaps, and then choose View -> MetaInfo -> Show!. The #US heap is the last one in the listing and looks like this:

User Strings
-------------------------------------------------------
70000001 : ( 5) L"Jason"
7000000d : ( 5) L"Haley"

If you look at the CreateDefaultCustomer() method in ILDasm with token display turned on, you will see the following (notice the token next to each string, which links it to the #US heap):

.method /*06000016*/ public hidebysig instance class SouthRain.Customer/*02000003*/
        CreateDefaultCustomer() cil managed
{
  // Code size       49 (0x31)
  .maxstack  2
  .locals /*11000003*/ init ([0] class SouthRain.Customer/*02000003*/ c,
           [1] class SouthRain.Customer/*02000003*/ CS$1$0000)
  IL_0000:  nop
  IL_0001:  newobj     instance void SouthRain.Customer/*02000003*/::.ctor() /* 06000015 */
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldstr      "Jason" /* 70000001 */
  IL_000d:  callvirt   instance void SouthRain.Customer/*02000003*/::set_FirstName(string) /* 06000002 */
  IL_0012:  nop
  IL_0013:  ldloc.0
  IL_0014:  ldstr      "Haley" /* 7000000D */
  IL_0019:  callvirt   instance void SouthRain.Customer/*02000003*/::set_LastName(string) /* 06000004 */
  IL_001e:  nop
  IL_001f:  ldloc.0
  IL_0020:  ldsfld     string [mscorlib/*23000001*/]System.String/*01000013*/::Empty /* 0A000011 */
  IL_0025:  callvirt   instance void SouthRain.Customer/*02000003*/::set_Company(string) /* 06000006 */
  IL_002a:  nop
  IL_002b:  ldloc.0
  IL_002c:  stloc.1
  IL_002d:  br.s       IL_002f
  IL_002f:  ldloc.1
  IL_0030:  ret
} // end of method TestCustomer::CreateDefaultCustomer
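
The token comments in this listing can be unpacked by hand. The sketch below assumes the standard metadata layout described in ECMA-335: the top byte 0x70 marks a user-string token, the low 24 bits are a byte offset into the #US heap, and each heap entry is a compressed length prefix, the UTF-16 characters, and one trailing flag byte. The class and method names are made up for illustration.

```csharp
using System;

public static class UserStringTokens
{
    // A metadata token whose top byte is 0x70 refers to the #US heap.
    public static bool IsUserString(uint token)
    {
        return (token >> 24) == 0x70;
    }

    // The low 24 bits are the byte offset of the entry inside the #US heap.
    public static int HeapOffset(uint token)
    {
        return (int)(token & 0x00FFFFFF);
    }

    // Bytes occupied by one #US entry: compressed length prefix,
    // 2 bytes per UTF-16 code unit, plus 1 trailing flag byte.
    public static int EntrySize(string value)
    {
        int blobLength = value.Length * 2 + 1;
        int prefix = blobLength <= 0x7F ? 1 : (blobLength <= 0x3FFF ? 2 : 4);
        return prefix + blobLength;
    }
}
```

With these helpers, HeapOffset(0x70000001) is 1, and adding EntrySize("Jason"), which is 12 bytes, gives 13 (0xD). That matches the 7000000D token on the second ldstr.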

So now you know: the text is pretty easy to spot. Now let's see what it looks like after an obfuscator has encrypted the strings. First, look at it in Reflector again.

It is harder to read now. If you look at the #US heap, you'll see that the strings now seem to be just a bunch of bytes:

User Strings
-------------------------------------------------------
70000001 : ( 5) L".."
  User string has unprintables, hex format below:
  246d 116f 0171 1b73 1875
7000000d : ( 5) L"."
  User string has unprintables, hex format below:
  266d 116f 1e71 1173 0f75

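If you look at the same method now, you can see that the code size has gone from 49 bytes to 76 bytes. That is 27 extra bytes: 9 to set up the int32 key local at the top of the method, plus 9 after each of the two ldstr instructions to load the key and call the decrypt function. For a method this small that is an increase of more than half, but that isn't always going to be the case, since the overhead depends on how many strings a method uses. The added code is needed to decrypt each string before it gets used in your code.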

.method /*06000019*/ public hidebysig instance class SouthRain.Customer/*02000004*/
        CreateDefaultCustomer() cil managed
{
  // Code size       76 (0x4c)
  .maxstack  3
  .locals /*11000004*/ init (class SouthRain.Customer/*02000004*/ V_0,
           class SouthRain.Customer/*02000004*/ V_1,
           int32 V_2)
  IL_0000:  ldc.i4     0x0
  IL_0005:  stloc      V_2
  IL_0009:  nop
  IL_000a:  newobj     instance void SouthRain.Customer/*02000004*/::.ctor() /* 06000018 */
  IL_000f:  stloc.0
  IL_0010:  ldloc.0
  IL_0011:  ldstr      bytearray (6D 24 6F 11 71 01 73 1B 75 18 )  // m$o.q.s.u. /* 70000001 */
  IL_0016:  ldloc      V_2
  IL_001a:  call       string a$PST06000001(string,
                                            int32) /* 06000001 */
  IL_001f:  callvirt   instance void SouthRain.Customer/*02000004*/::set_FirstName(string) /* 06000005 */
  IL_0024:  nop
  IL_0025:  ldloc.0
  IL_0026:  ldstr      bytearray (6D 26 6F 11 71 1E 73 11 75 0F )  // m&o.q.s.u. /* 7000000D */
  IL_002b:  ldloc      V_2
  IL_002f:  call       string a$PST06000001(string,
                                            int32) /* 06000001 */
  IL_0034:  callvirt   instance void SouthRain.Customer/*02000004*/::set_LastName(string) /* 06000007 */
  IL_0039:  nop
  IL_003a:  ldloc.0
  IL_003b:  ldsfld     string [mscorlib/*23000001*/]System.String/*0100000D*/::Empty /* 0A000013 */
  IL_0040:  callvirt   instance void SouthRain.Customer/*02000004*/::set_Company(string) /* 06000009 */
  IL_0045:  nop
  IL_0046:  ldloc.0
  IL_0047:  stloc.1
  IL_0048:  br.s       IL_004a
  IL_004a:  ldloc.1
  IL_004b:  ret
} // end of method TestCustomer::CreateDefaultCustomer

You can compare the two listings line by line to see what has been added (if you want), but the basics are simple. When you encrypt the strings, a decrypt function is added to the assembly (usually a single static method, so it isn't much of an addition), and a call to it is inserted after each ldstr so that the decrypted string is what ends up on the stack. So it comes down to this: the strings are now harder to figure out (and practically impossible to search for), but the added code makes the IL and the assembly bigger. Tradeoffs ... always tradeoffs.
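
As a concrete illustration of "harder, not impossible": the hex words in the #US dumps in this post follow a simple pattern. The low byte of each 16-bit value counts up by two, and the high byte is the original character XORed with that low byte plus one. The sketch below decodes on that observation alone. It is inferred from the dumps here, not taken from the obfuscator's a$PST06000001 routine, it ignores the int32 key argument, and it only applies to characters below U+0100. The names RollingXorSketch and Decode are made up.

```csharp
using System;

public static class RollingXorSketch
{
    // Each input value is one UTF-16 code unit as ILDasm prints it, e.g. 0x246d.
    public static string Decode(ushort[] encrypted)
    {
        char[] plain = new char[encrypted.Length];
        for (int i = 0; i < encrypted.Length; i++)
        {
            // Running key byte; visible because the original high byte was 0.
            int low = encrypted[i] & 0xFF;
            // Original character XORed with (key + 1).
            int high = encrypted[i] >> 8;
            plain[i] = (char)(high ^ ((low + 1) & 0xFF));
        }
        return new string(plain);
    }
}
```

Decode(new ushort[] { 0x246d, 0x116f, 0x0171, 0x1b73, 0x1875 }) returns "Jason", and the second entry (266d 116f 1e71 1173 0f75) comes back as "Haley". Encrypted strings stop a text search, but they do not stop someone who is willing to look at the bytes.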

Now that you have an idea of what string encryption is, I want to mention a tip about using runtime constants (readonly) instead of compile-time constants (const). If the fact that Bill Wagner, in "Effective C#: 50 Specific Ways to Improve Your C#", makes "Prefer readonly to const" Item #2 isn't good enough for you, then how about this interesting example ...

If you have some code using const like the following:

    public class TestCustomerWithConst
    {
        const string FirstNameDefault = "Jason";
        const string LastNameDefault = "Haley";
        const string CompanyDefault = "";

        public Customer CreateDefaultCustomer()
        {
            Customer c = new Customer();

            c.FirstName = FirstNameDefault;
            c.LastName = LastNameDefault;
            c.Company = CompanyDefault;

            return c;
        }
    }

String encryption works the same way for the methods that use the constants, as you can see in Reflector.

The strings in the #US heap also look fine (except that there is now a new, empty one):

User Strings
-------------------------------------------------------
70000001 : ( 5) L"..."
  User string has unprintables, hex format below:
  377c 1e7e f280 ec82 eb84
7000000d : ( 5) L".."
  User string has unprintables, hex format below:
  357c 1e7e ed80 e682 fc84
70000019 : ( 0) L""

The IL for the method also looks OK ...

.method /*06000019*/ public hidebysig instance class SouthRain.Customer/*02000004*/
        CreateDefaultCustomer() cil managed
{
  // Code size       76 (0x4c)
  .maxstack  3
  .locals /*11000004*/ init (class SouthRain.Customer/*02000004*/ V_0,
           class SouthRain.Customer/*02000004*/ V_1,
           int32 V_2)
  IL_0000:  ldc.i4     0xc
  IL_0005:  stloc      V_2
  IL_0009:  nop
  IL_000a:  newobj     instance void SouthRain.Customer/*02000004*/::.ctor() /* 06000018 */
  IL_000f:  stloc.0
  IL_0010:  ldloc.0
  IL_0011:  ldstr      bytearray (7C 37 7E 1E 80 F2 82 EC 84 EB )  // |7~....... /* 70000001 */
  IL_0016:  ldloc      V_2
  IL_001a:  call       string a$PST06000001(string,
                                            int32) /* 06000001 */
  IL_001f:  callvirt   instance void SouthRain.Customer/*02000004*/::set_FirstName(string) /* 06000005 */
  IL_0024:  nop
  IL_0025:  ldloc.0
  IL_0026:  ldstr      bytearray (7C 35 7E 1E 80 ED 82 E6 84 FC )  // |5~....... /* 7000000D */
  IL_002b:  ldloc      V_2
  IL_002f:  call       string a$PST06000001(string,
                                            int32) /* 06000001 */
  IL_0034:  callvirt   instance void SouthRain.Customer/*02000004*/::set_LastName(string) /* 06000007 */
  IL_0039:  nop
  IL_003a:  ldloc.0
  IL_003b:  ldstr      "" /* 70000019 */
  IL_0040:  callvirt   instance void SouthRain.Customer/*02000004*/::set_Company(string) /* 06000009 */
  IL_0045:  nop
  IL_0046:  ldloc.0
  IL_0047:  stloc.1
  IL_0048:  br.s       IL_004a
  IL_004a:  ldloc.1
  IL_004b:  ret
} // end of method TestCustomerWithConst::CreateDefaultCustomer

BUT if you look at the class definition in Reflector (or at the constant declarations in ILDasm), you will see the plain-text values of the constants that were supposed to be encrypted ... right? Well, actually the #US strings were encrypted (we even looked at them above); it is the default values of the constant fields that are not encrypted. You might be thinking, "Aren't they the same?"

Not exactly. Default values for constants (which are actually literal fields in IL) are stored in the #Blob heap. If you look at the #Blob heap, you can see the strings that Reflector (probably) uses to recreate the plain-text code (the entries at offsets 8b and 96):

Blob Heap:  408(0x198) bytes
    0,0 :                                                  >                <
    1,4 : 20 01 01 0e                                      >                <
    6,4 : 20 01 01 02                                      >                <
    b,4 : 20 01 01 08                                      >                <
   10,3 : 20 00 01                                         >                <
   14,5 : 20 01 01 1d 03                                   >                <
   1a,1b: 01 08 1e 3c 3d 3e 41 3e  48 40 4a 44 40 43 42 47 >   <=>A>H@JD@CBG<
        : 49 4b 4f 47 4c 52 4f 4d  55 1f 20                >IKOGLROMU       <
   36,4 : 20 00 1d 03                                      >                <
   3b,4 : 00 01 0e 0e                                      >                <
   40,5 : 20 01 01 11 49                                   >    I           <
   46,8 : b7 7a 5c 56 19 34 e0 89                          > z\V 4          <
   4f,8 : 01 00 01 00 00 00 00 00                          >                <
   58,5 : 00 02 0e 0e 08                                   >                <
   5e,2 : 06 0e                                            >                <
   61,3 : 20 00 0e                                         >                <
   65,2 : 06 08                                            >                <
   68,3 : 06 11 0c                                         >                <
   6c,4 : 00 00 00 00                                      >                <
   71,4 : 01 00 00 00                                      >                <
   76,4 : 02 00 00 00                                      >                <
   7b,4 : 03 00 00 00                                      >                <
   80,4 : 20 00 11 0c                                      >                <
   85,5 : 20 01 01 11 0c                                   >                <
   8b,a : 4a 00 61 00 73 00 6f 00  6e 00                   >J a s o n       <
   96,a : 48 00 61 00 6c 00 65 00  79 00                   >H a l e y       <
   a1,4 : 20 00 12 10                                      >                <
   a6,3 : 28 00 0e                                         >(               <
   aa,4 : 28 00 11 0c                                      >(               <
   af,c : 01 00 07 31 2e 30 2e 30  2e 30 00 00             >   1.0.0.0      <
   bc,29: 01 00 24 33 37 33 36 31  36 34 33 2d 63 65 33 63 >  $37361643-ce3c<
        : 2d 34 35 35 65 2d 61 66  32 65 2d 37 31 66 32 38 >-455e-af2e-71f28<
        : 66 39 38 32 37 39 62 00  00                      >f98279b         <
   e6,5 : 01 00 00 00 00                                   >                <
   ec,27: 01 00 22 43 6f 70 79 72  69 67 68 74 20 c2 a9 20 >  "Copyright    <
        : 48 61 6c 65 79 20 45 6e  74 65 72 70 72 69 73 65 >Haley Enterprise<
        : 20 32 30 30 36 00 00                             > 2006           <
  114,e : 01 00 09 53 6f 75 74 68  52 61 69 6e 00 00       >   SouthRain    <
  123,15: 01 00 10 48 61 6c 65 79  20 45 6e 74 65 72 70 72 >   Haley Enterpr<
        : 69 73 65 00 00                                   >ise             <
  139,8 : 01 00 08 00 00 00 00 00                          >                <
  142,1e: 01 00 01 00 54 02 16 57  72 61 70 4e 6f 6e 45 78 >    T  WrapNonEx<
        : 63 65 70 74 69 6f 6e 54  68 72 6f 77 73 01       >ceptionThrows   <
  161,1b: 01 00 16 33 33 33 35 31  3a 31 3a 33 2e 30 2e 32 >   33351:1:3.0.2<
        : 33 34 37 2e 32 37 33 30  37 00 00                >347.27307       <
  17d,8 : 07 05 1d 03 08 08 05 05                          >                <
  186,3 : 07 01 0e                                         >                <
  18a,4 : 07 01 11 0c                                      >                <
  18f,7 : 07 03 12 10 12 10 08                             >                <
  197,0 :                                                  >                <
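
The two entries at offsets 8b and 96 are nothing more than the constant values stored as UTF-16 little-endian bytes. Here is a minimal sketch of decoding them, using only the bytes copied from the dump above (the class name BlobConstants is made up for illustration):

```csharp
using System;
using System.Text;

public static class BlobConstants
{
    // Bytes copied from the #Blob dump, entries 8b and 96.
    public static readonly byte[] Entry8B =
        { 0x4a, 0x00, 0x61, 0x00, 0x73, 0x00, 0x6f, 0x00, 0x6e, 0x00 };
    public static readonly byte[] Entry96 =
        { 0x48, 0x00, 0x61, 0x00, 0x6c, 0x00, 0x65, 0x00, 0x79, 0x00 };

    // The constant values are plain UTF-16LE, so no key or decrypt call is needed.
    public static string Decode(byte[] blob)
    {
        return Encoding.Unicode.GetString(blob);
    }
}
```

Decode(Entry8B) gives "Jason" and Decode(Entry96) gives "Haley". Nothing had to be decrypted, which is exactly the problem.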

Now if I change the code to use readonly instead of const like this:

    public class TestCustomerWithReadOnly
    {
        readonly string FirstNameDefault = "Jason";
        readonly string LastNameDefault = "Haley";
        readonly string CompanyDefault = string.Empty;

        public Customer CreateDefaultCustomer()
        {
            Customer c = new Customer();

            c.FirstName = FirstNameDefault;
            c.LastName = LastNameDefault;
            c.Company = CompanyDefault;

            return c;
        }
    }

Reflector now shows slightly different code (with good reason): the properties are now set from the readonly fields (as you can tell, I didn't turn renaming on for this example).

If you look at the #US heap, you'll see it looks like it originally did: only two strings, and both are encrypted.

User Strings
-------------------------------------------------------
70000001 : ( 5) L"....."
  User string has unprintables, hex format below:
  df94 f696 ea98 f49a f39c
7000000d : ( 5) L"...."
  User string has unprintables, hex format below:
  dd94 f696 f598 fe9a e49c

If you look at the method in ILDasm you can also tell it is using the fields to set the properties ... so where are the fields initialized?

.method /*06000019*/ public hidebysig instance class SouthRain.Customer/*02000004*/
        CreateDefaultCustomer() cil managed
{
  // Code size       52 (0x34)
  .maxstack  2
  .locals /*11000004*/ init (class SouthRain.Customer/*02000004*/ V_0,
           class SouthRain.Customer/*02000004*/ V_1)
  IL_0000:  nop
  IL_0001:  newobj     instance void SouthRain.Customer/*02000004*/::.ctor() /* 06000018 */
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldarg.0
  IL_0009:  ldfld      string SouthRain.TestCustomerWithReadOnly/*02000005*/::FirstNameDefault /* 04000011 */
  IL_000e:  callvirt   instance void SouthRain.Customer/*02000004*/::set_FirstName(string) /* 06000005 */
  IL_0013:  nop
  IL_0014:  ldloc.0
  IL_0015:  ldarg.0
  IL_0016:  ldfld      string SouthRain.TestCustomerWithReadOnly/*02000005*/::LastNameDefault /* 04000012 */
  IL_001b:  callvirt   instance void SouthRain.Customer/*02000004*/::set_LastName(string) /* 06000007 */
  IL_0020:  nop
  IL_0021:  ldloc.0
  IL_0022:  ldarg.0
  IL_0023:  ldfld      string SouthRain.TestCustomerWithReadOnly/*02000005*/::CompanyDefault /* 04000013 */
  IL_0028:  callvirt   instance void SouthRain.Customer/*02000004*/::set_Company(string) /* 06000009 */
  IL_002d:  nop
  IL_002e:  ldloc.0
  IL_002f:  stloc.1
  IL_0030:  br.s       IL_0032
  IL_0032:  ldloc.1
  IL_0033:  ret
} // end of method TestCustomerWithReadOnly::CreateDefaultCustomer

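They are initialized in the constructor, and the strings are encrypted! Well, except for String.Empty anyway: it is loaded with ldsfld as a field reference rather than as a string literal, so there is nothing in the #US heap to encrypt.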

.method /*0600001A*/ public hidebysig specialname rtspecialname
        instance void  .ctor() cil managed
{
  // Code size       68 (0x44)
  .maxstack  9
  .locals /*11000005*/ init (int32 V_0)
  IL_0000:  ldc.i4     0xd
  IL_0005:  stloc      V_0
  IL_0009:  ldarg.0
  IL_000a:  ldstr      bytearray (94 DF 96 F6 98 EA 9A F4 9C F3 )  /* 70000001 */
  IL_000f:  ldloc      V_0
  IL_0013:  call       string a$PST06000001(string,
                                            int32) /* 06000001 */
  IL_0018:  stfld      string SouthRain.TestCustomerWithReadOnly/*02000005*/::FirstNameDefault /* 04000011 */
  IL_001d:  ldarg.0
  IL_001e:  ldstr      bytearray (94 DD 96 F6 98 F5 9A FE 9C E4 )  /* 7000000D */
  IL_0023:  ldloc      V_0
  IL_0027:  call       string a$PST06000001(string,
                                            int32) /* 06000001 */
  IL_002c:  stfld      string SouthRain.TestCustomerWithReadOnly/*02000005*/::LastNameDefault /* 04000012 */
  IL_0031:  ldarg.0
  IL_0032:  ldsfld     string [mscorlib/*23000001*/]System.String/*0100000D*/::Empty /* 0A000013 */
  IL_0037:  stfld      string SouthRain.TestCustomerWithReadOnly/*02000005*/::CompanyDefault /* 04000013 */
  IL_003c:  ldarg.0
  IL_003d:  call       instance void [mscorlib/*23000001*/]System.Object/*01000010*/::.ctor() /* 0A000012 */
  IL_0042:  nop
  IL_0043:  ret
} // end of method TestCustomerWithReadOnly::.ctor

And if you look at the class definition in Reflector, you will see only the declarations, with no values.

Since the values are actually set in the constructor, there are no default values stored in the #Blob heap for the readonly fields:

Blob Heap:  388(0x184) bytes
    0,0 :                                                  >                <
    1,4 : 20 01 01 0e                                      >                <
    6,4 : 20 01 01 02                                      >                <
    b,4 : 20 01 01 08                                      >                <
   10,3 : 20 00 01                                         >                <
   14,5 : 20 01 01 1d 03                                   >                <
   1a,1b: 01 0c 22 40 41 42 45 42  4c 44 4e 48 44 47 46 4b >  "@ABEBLDNHDGFK<
        : 4d 4f 53 4b 50 56 53 51  59 23 24                >MOSKPVSQY#$     <
   36,4 : 20 00 1d 03                                      >                <
   3b,4 : 00 01 0e 0e                                      >                <
   40,5 : 20 01 01 11 49                                   >    I           <
   46,2 : 06 0e                                            >                <
   49,8 : b7 7a 5c 56 19 34 e0 89                          > z\V 4          <
   52,8 : 01 00 01 00 00 00 00 00                          >                <
   5b,5 : 00 02 0e 0e 08                                   >                <
   61,3 : 20 00 0e                                         >                <
   65,2 : 06 08                                            >                <
   68,3 : 06 11 0c                                         >                <
   6c,4 : 00 00 00 00                                      >                <
   71,4 : 01 00 00 00                                      >                <
   76,4 : 02 00 00 00                                      >                <
   7b,4 : 03 00 00 00                                      >                <
   80,4 : 20 00 11 0c                                      >                <
   85,5 : 20 01 01 11 0c                                   >                <
   8b,4 : 20 00 12 10                                      >                <
   90,3 : 28 00 0e                                         >(               <
   94,4 : 28 00 11 0c                                      >(               <
   99,c : 01 00 07 31 2e 30 2e 30  2e 30 00 00             >   1.0.0.0      <
   a6,29: 01 00 24 33 37 33 36 31  36 34 33 2d 63 65 33 63 >  $37361643-ce3c<
        : 2d 34 35 35 65 2d 61 66  32 65 2d 37 31 66 32 38 >-455e-af2e-71f28<
        : 66 39 38 32 37 39 62 00  00                      >f98279b         <
   d0,5 : 01 00 00 00 00                                   >                <
   d6,27: 01 00 22 43 6f 70 79 72  69 67 68 74 20 c2 a9 20 >  "Copyright    <
        : 48 61 6c 65 79 20 45 6e  74 65 72 70 72 69 73 65 >Haley Enterprise<
        : 20 32 30 30 36 00 00                             > 2006           <
   fe,e : 01 00 09 53 6f 75 74 68  52 61 69 6e 00 00       >   SouthRain    <
  10d,15: 01 00 10 48 61 6c 65 79  20 45 6e 74 65 72 70 72 >   Haley Enterpr<
        : 69 73 65 00 00                                   >ise             <
  123,8 : 01 00 08 00 00 00 00 00                          >                <
  12c,1e: 01 00 01 00 54 02 16 57  72 61 70 4e 6f 6e 45 78 >    T  WrapNonEx<
        : 63 65 70 74 69 6f 6e 54  68 72 6f 77 73 01       >ceptionThrows   <
  14b,1b: 01 00 16 33 33 33 35 31  3a 31 3a 33 2e 30 2e 32 >   33351:1:3.0.2<
        : 33 34 37 2e 32 37 33 30  37 00 00                >347.27307       <
  167,8 : 07 05 1d 03 08 08 05 05                          >                <
  170,3 : 07 01 0e                                         >                <
  174,4 : 07 01 11 0c                                      >                <
  179,6 : 07 02 12 10 12 10                                >                <
  180,3 : 07 01 08                                         >                <
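
The same difference is visible through reflection. In this sketch (the two small classes are stand-ins for the ones in this post, and FieldReport is a made-up helper), a const shows up as a literal field whose value can be read straight from metadata, while a readonly field is only marked initonly and has no stored value until a constructor runs.

```csharp
using System;
using System.Reflection;

public class WithConst
{
    const string FirstNameDefault = "Jason";
}

public class WithReadOnly
{
    readonly string FirstNameDefault = "Jason";
}

public static class FieldReport
{
    const BindingFlags All = BindingFlags.NonPublic | BindingFlags.Public |
                             BindingFlags.Instance | BindingFlags.Static;

    // Returns the metadata constant for a const field, or null when the field
    // is not a literal (for example, a readonly field set in a constructor).
    public static object MetadataValue(Type type, string fieldName)
    {
        FieldInfo field = type.GetField(fieldName, All);
        return field.IsLiteral ? field.GetRawConstantValue() : null;
    }

    // True for readonly (initonly) fields, false for const (literal) fields.
    public static bool IsInitOnly(Type type, string fieldName)
    {
        return type.GetField(fieldName, All).IsInitOnly;
    }
}
```

MetadataValue(typeof(WithConst), "FirstNameDefault") returns "Jason" without creating an instance or running any code from the class. The same call for WithReadOnly returns null, because the value only exists after the constructor has run.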

What do the examples show you about string encryption?
  1. It makes reverse engineering harder (not impossible, but harder)
  2. It makes the IL for methods larger (which can make the assembly file larger)
  3. Use readonly instead of const, so the values are not left in plain text in the #Blob heap

Those three items are the things I want you to remember from this entry. As with a lot of things in programming, there are tradeoffs. It is up to you and your needs to determine what is best for you.
