Archive for November, 2012

PBKDF2 – Pure Java Implementation

Here’s my PBKDF2 implementation. It’s a clean-room implementation straight from RFC 2898. It passes all RFC 6070 test vectors (testing code included).

It’s pure Java with no weird requirements or external libraries. It performs reasonably well compared to the stock Java implementation of PBKDF2WithHmacSHA1; sometimes it beats it, sometimes it doesn’t. I’m sure there are ways to optimize performance.

Anyway, enjoy…

/*
 * Copyright (c) 2012
 * Cole Barnes [cryptofreek{at}gmail{dot}com]
 * https://cryptofreek.org/
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 * 
 * -----------------------------------------------------------------------------
 * 
 * This is a clean-room implementation of PBKDF2 using RFC 2898 as a reference.
 * 
 * RFC 2898:
 * http://tools.ietf.org/html/rfc2898#section-5.2
 * 
 * This code passes all RFC 6070 test vectors:
 * http://tools.ietf.org/html/rfc6070
 * 
 * The function "nativeDerive()" is supplied as an example of the native Java 
 * PBKDF2WithHmacSHA1 implementation.  It is used for benchmarking and 
 * comparison only.
 * 
 * The functions "fromHex()" and "toHex()" came from some message board
 * somewhere.  No license was included.
 * 
 */

import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.security.spec.KeySpec;
import java.util.Formatter;

import javax.crypto.Mac;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class CPbkdf2
{
  /* START RFC 2898 IMPLEMENTATION */
  public static byte[] derive(String P, String S, int c, int dkLen)
  {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();

    try
    {
      int hLen = 20;

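      // RFC 2898 section 5.2, step 1. Because dkLen is an int, it can never
      // exceed (2^32 - 1) * hLen, so this branch only mirrors the spec.
      if (dkLen > ((Math.pow(2, 32)) - 1) * hLen)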
      {
        System.out.println("derived key too long");
      }
      else
      {
        int l = (int) Math.ceil((double) dkLen / (double) hLen);
        // int r = dkLen - (l-1)*hLen;

        for (int i = 1; i <= l; i++)
        {
          byte[] T = F(P, S, c, i);
          baos.write(T);
        }
      }
    }
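    catch (Exception e)
    {
      // Fail loudly: continuing with an empty buffer would make the
      // arraycopy below throw an unrelated ArrayIndexOutOfBoundsException.
      throw new IllegalStateException("PBKDF2 derivation failed", e);
    }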

    byte[] baDerived = new byte[dkLen];
    System.arraycopy(baos.toByteArray(), 0, baDerived, 0, baDerived.length);

    return baDerived;
  }

  private static byte[] F(String P, String S, int c, int i) throws Exception
  {
    byte[] U_LAST = null;
    byte[] U_XOR = null;

    SecretKeySpec key = new SecretKeySpec(P.getBytes("UTF-8"), "HmacSHA1");
    Mac mac = Mac.getInstance(key.getAlgorithm());
    mac.init(key);

    for (int j = 0; j < c; j++)
    {
      if (j == 0)
      {
        byte[] baS = S.getBytes("UTF-8");
        byte[] baI = INT(i);
        byte[] baU = new byte[baS.length + baI.length];

        System.arraycopy(baS, 0, baU, 0, baS.length);
        System.arraycopy(baI, 0, baU, baS.length, baI.length);

        U_XOR = mac.doFinal(baU);
        U_LAST = U_XOR;
        mac.reset();
      }
      else
      {
        byte[] baU = mac.doFinal(U_LAST);
        mac.reset();

        for (int k = 0; k < U_XOR.length; k++)
        {
          U_XOR[k] = (byte) (U_XOR[k] ^ baU[k]);
        }

        U_LAST = baU;
      }
    }

    return U_XOR;
  }

  private static byte[] INT(int i)
  {
    ByteBuffer bb = ByteBuffer.allocate(4);
    bb.order(ByteOrder.BIG_ENDIAN);
    bb.putInt(i);

    return bb.array();
  }
  /* END RFC 2898 IMPLEMENTATION */

  /* START HELPER FUNCTIONS */
  private static String toHex(byte[] ba)
  {
    String strHex = null;

    if (ba != null)
    {
      StringBuilder sb = new StringBuilder(ba.length * 2);
      Formatter formatter = new Formatter(sb);

      for (byte b : ba)
      {
        formatter.format("%02x", b);
      }

      formatter.close();
      strHex = sb.toString().toLowerCase();
    }

    return strHex;
  }

  private static byte[] nativeDerive(String strPassword, String strSalt, int nIterations, int nKeyLen)
  {
    byte[] baDerived = null;

    try
    {
      SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
      KeySpec ks = new PBEKeySpec(strPassword.toCharArray(), strSalt.getBytes("UTF-8"), nIterations, nKeyLen * 8);
      SecretKey s = f.generateSecret(ks);
      baDerived = s.getEncoded();
    }
    catch (Exception e)
    {
      e.printStackTrace();
    }

    return baDerived;
  }
  /* END HELPER FUNCTIONS */

  public static void runTestVector(String P, String S, int c, int dkLen, String strExpectedDk)
  {
    System.out.println("Input:");
    System.out.println("  P = \"" + P + "\"");
    System.out.println("  S = \"" + S + "\"");
    System.out.println("  c = " + c);
    System.out.println("  dkLen = " + dkLen);
    System.out.println();

    long nStartDk = System.nanoTime();
    byte[] DK = derive(P, S, c, dkLen);
    long nStopDk = System.nanoTime();
    
    long nStartDkNative = System.nanoTime();
    byte[] DK_NATIVE = nativeDerive(P, S, c, dkLen);
    long nStopDkNative = System.nanoTime();

    System.out.println("Output:");
    System.out.println("  DK          = " + toHex(DK));
    System.out.println("  DK_NATIVE   = " + toHex(DK_NATIVE));
    System.out.println("  DK_EXPECTED = " + strExpectedDk.replaceAll(" ", ""));
    System.out.println();

    System.out.println("Duration [my implementation]:      " + (nStopDk - nStartDk) + " ns" );
    System.out.println("Duration [native implementation]:  " + (nStopDkNative - nStartDkNative) + " ns" );
    
    System.out.println("---------------------------------------------------------------");
    System.out.println();
  }

  public static void RFC6070()
  {
    runTestVector("password", "salt", 1, 20, "0c 60 c8 0f 96 1f 0e 71 f3 a9 b5 24 af 60 12 06 2f e0 37 a6");
    runTestVector("password", "salt", 2, 20, "ea 6c 01 4d c7 2d 6f 8c cd 1e d9 2a ce 1d 41 f0 d8 de 89 57");
    runTestVector("password", "salt", 4096, 20, "4b 00 79 01 b7 65 48 9a be ad 49 d9 26 f7 21 d0 65 a4 29 c1");
    runTestVector("password", "salt", 16777216, 20, "ee fe 3d 61 cd 4d a4 e4 e9 94 5b 3d 6b a2 15 8c 26 34 e9 84");
    runTestVector("passwordPASSWORDpassword", "saltSALTsaltSALTsaltSALTsaltSALTsalt", 4096, 25, "3d 2e ec 4f e4 1c 84 9b 80 c8 d8 36 62 c0 e4 4a 8b 29 1a 96 4c f2 f0 70 38");
    runTestVector("pass\0word", "sa\0lt", 4096, 16, "56 fa 6a a7 55 48 09 9d cc 37 d7 f0 34 25 e0 c3");
  }

  public static void main(String[] args)
  {
    RFC6070();
  }
}
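
One caveat on the timings printed by runTestVector(): each implementation is timed exactly once, so JIT warm-up and class loading can dominate the short runs, which may be part of why the winner flips around. Here is a minimal sketch of a steadier measurement. The helper medianNanos() and the warm-up and run counts are my own illustrative assumptions, not part of the code above.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class WarmBenchmark
{
  // Hypothetical helper: runs the task untimed `warmups` times, then returns
  // the median duration in nanoseconds of `runs` timed calls.
  static long medianNanos(Runnable task, int warmups, int runs)
  {
    for (int i = 0; i < warmups; i++)
    {
      task.run();
    }

    long[] samples = new long[runs];
    for (int i = 0; i < runs; i++)
    {
      long start = System.nanoTime();
      task.run();
      samples[i] = System.nanoTime() - start;
    }

    Arrays.sort(samples);
    return samples[runs / 2];
  }

  // Stock JDK PBKDF2WithHmacSHA1, used here only as something to time.
  static byte[] stock(String password, String salt, int iterations, int keyLenBytes)
  {
    try
    {
      PBEKeySpec spec = new PBEKeySpec(password.toCharArray(),
          salt.getBytes(StandardCharsets.UTF_8), iterations, keyLenBytes * 8);
      return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1").generateSecret(spec).getEncoded();
    }
    catch (GeneralSecurityException e)
    {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args)
  {
    // 20 warm-up calls and 21 timed calls are arbitrary illustrative counts.
    long ns = medianNanos(() -> stock("password", "salt", 4096, 20), 20, 21);
    System.out.println("Median duration [stock implementation]: " + ns + " ns");
  }
}
```

The same helper could wrap a call to CPbkdf2.derive() to compare the two on equal footing.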

RFC 6070 Test Vectors and RFC 2898 implementations that enforce salt lengths.

UGH!!!

A little background: RFC 2898 defines (among other things) the PBKDF2 algorithm for generating encryption keys based on a given password. The details are unimportant, but it produces a salted, iterated hash of the password that is computationally impractical to crack given a sufficiently long salt and enough iterations. RFC 2898 “recommends” an iteration count of 1000 or greater (section “4.2 Iteration Count”). It also says that salts “should” be at least 8 octets (or 64 bits) long (section “4.1 Salt”).

When reading RFCs, terminology is extremely important for implementation. The words “should” and “recommend” are very different from the word “must” (check out RFC 2119). You can ignore a “should” and still conform to the spec; a “must”, however, is an absolute requirement.

In the world of cryptography, the “should” things are very important. Your algorithm may not be as secure as you think if you ignore the “should” stuff, even though you have implemented everything as defined in the RFC. In the case of the PBKDF2 algorithm, simply don’t use salts shorter than 8 octets or fewer than 1000 iterations. Simple as that. It’s not required by the RFC, it’s just smart.
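
As a rough sketch of what following the “should” stuff looks like with the stock Java API: generate a random salt comfortably over 8 octets and use an iteration count above 1000. The 16-byte salt, the 10000 iterations, and the class and method names are illustrative assumptions on my part, not values from the RFC.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class SaltedDerivation
{
  // Illustrative choices: both exceed the RFC 2898 recommendations
  // (salt of at least 8 octets, at least 1000 iterations).
  static final int SALT_BYTES = 16;
  static final int ITERATIONS = 10000;

  // Returns a fresh random salt from a cryptographically strong source.
  static byte[] newSalt()
  {
    byte[] salt = new byte[SALT_BYTES];
    new SecureRandom().nextBytes(salt);
    return salt;
  }

  // Derives dkLenBytes of key material with the stock PBKDF2WithHmacSHA1.
  static byte[] derive(char[] password, byte[] salt, int dkLenBytes)
  {
    PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, dkLenBytes * 8);
    try
    {
      SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
      return f.generateSecret(spec).getEncoded();
    }
    catch (GeneralSecurityException e)
    {
      throw new IllegalStateException(e);
    }
    finally
    {
      // Clears the spec's internal copy of the password.
      spec.clearPassword();
    }
  }

  public static void main(String[] args)
  {
    byte[] salt = newSalt();
    byte[] dk = derive("correct horse".toCharArray(), salt, 20);
    System.out.println("salt bytes = " + salt.length + ", key bytes = " + dk.length);
  }
}
```

The salt has to be stored alongside whatever the derived key protects, since the same salt is needed to derive the same key again.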

From a developer’s standpoint, the question always looms: Do I give my users enough rope to hang themselves? When implementing PBKDF2, should I even allow salts of less than 64 bits or small iteration counts? For super secure systems, the answer probably should be “no”.
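
One way to split the difference is to enforce the recommendations by default but leave an explicit opt-out. A minimal sketch, assuming a hypothetical Pbkdf2Policy class and a `strict` flag that I made up for illustration:

```java
public class Pbkdf2Policy
{
  static final int MIN_SALT_BYTES = 8;    // RFC 2898 section 4.1: "should" be at least 8 octets
  static final int MIN_ITERATIONS = 1000; // RFC 2898 section 4.2: recommended minimum

  // Hypothetical check: rejects weak parameters only when strict is true,
  // so callers (and test suites) can still opt out deliberately.
  static void check(byte[] salt, int iterations, boolean strict)
  {
    if (!strict)
    {
      return;
    }

    if (salt == null || salt.length < MIN_SALT_BYTES)
    {
      throw new IllegalArgumentException("salt must be at least " + MIN_SALT_BYTES + " octets");
    }

    if (iterations < MIN_ITERATIONS)
    {
      throw new IllegalArgumentException("iteration count must be at least " + MIN_ITERATIONS);
    }
  }

  public static void main(String[] args)
  {
    check(new byte[16], 4096, true);  // passes
    check(new byte[4], 1, false);     // passes only because strict is off

    try
    {
      check(new byte[4], 4096, true);
    }
    catch (IllegalArgumentException e)
    {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Whether the opt-out should exist at all is exactly the rope question; for a super secure system you might simply drop the flag.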

Now, as a general rule, it’s a bad idea to attempt to code your own hashing algorithms. There are only a handful of people on the planet who have a firm enough grasp of modern cryptography to understand all the intricacies and implications of improperly coded cryptographic algorithms. I am not one of those people, and odds are, neither are you. So just don’t do it, especially if there’s a tried-and-true implementation available.

With that said, if you are going to attempt to code your own algorithms, there are certain test vectors that you can run through your code to make sure everything works properly. For PBKDF2 derived keys, those test vectors are defined in RFC 6070. If you are testing PBKDF2 with HmacSHA1, this is the set of data you use to test.
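
To make that concrete, here is a small sketch that runs the first three RFC 6070 vectors through the stock Java PBKDF2WithHmacSHA1 and compares the hex output with the expected value. The class and helper names are made up for illustration; swap in your own derive function to test your own code.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class Rfc6070Check
{
  // Lowercase hex encoding of a byte array.
  static String toHex(byte[] ba)
  {
    StringBuilder sb = new StringBuilder(ba.length * 2);
    for (byte b : ba)
    {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  // Implementation under test; here the stock JDK one.
  static byte[] stock(String p, String s, int c, int dkLen)
  {
    try
    {
      PBEKeySpec spec = new PBEKeySpec(p.toCharArray(),
          s.getBytes(StandardCharsets.UTF_8), c, dkLen * 8);
      return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1").generateSecret(spec).getEncoded();
    }
    catch (GeneralSecurityException e)
    {
      throw new IllegalStateException(e);
    }
  }

  // True when the derived key matches the expected hex (spaces ignored).
  static boolean matches(String p, String s, int c, int dkLen, String expectedHex)
  {
    return toHex(stock(p, s, c, dkLen)).equals(expectedHex.replace(" ", ""));
  }

  public static void main(String[] args)
  {
    System.out.println(matches("password", "salt", 1, 20,
        "0c 60 c8 0f 96 1f 0e 71 f3 a9 b5 24 af 60 12 06 2f e0 37 a6"));
    System.out.println(matches("password", "salt", 2, 20,
        "ea 6c 01 4d c7 2d 6f 8c cd 1e d9 2a ce 1d 41 f0 d8 de 89 57"));
    System.out.println(matches("password", "salt", 4096, 20,
        "4b 00 79 01 b7 65 48 9a be ad 49 d9 26 f7 21 d0 65 a4 29 c1"));
  }
}
```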

Sooooooooo… I’ve been playing around with PBKDF2 a bit, trying my hand at coding the algorithm in various languages, and testing the implementations that exist in various frameworks. I know I just told you not to do such things, but in fairness I’m not trying to generate the HMACs myself. PBKDF2 isn’t a hashing algorithm; it takes a bunch of hashes and squarshes them together in such a way that cracking the key through traditional means would be extremely difficult and time consuming. And yes, I said “squarshes”.
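
For the curious, the squarshing is defined in RFC 2898 section 5.2 roughly as follows, where PRF is the underlying pseudorandom function (HMAC-SHA1 for the RFC 6070 vectors), hLen is its output length in octets, and INT(i) is the block index encoded as a four-octet big-endian integer:

```latex
\begin{aligned}
DK  &= T_1 \,\|\, T_2 \,\|\, \cdots \,\|\, T_l\langle 0..r-1\rangle,
       \qquad l = \lceil dkLen / hLen \rceil,\quad r = dkLen - (l-1)\,hLen \\
T_i &= F(P, S, c, i) = U_1 \oplus U_2 \oplus \cdots \oplus U_c \\
U_1 &= \mathrm{PRF}\bigl(P,\; S \,\|\, \mathrm{INT}(i)\bigr) \\
U_j &= \mathrm{PRF}\bigl(P,\; U_{j-1}\bigr), \qquad 2 \le j \le c
\end{aligned}
```

Each block is a chain of c HMACs XORed together, which is where the iteration count gets its cost.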

Anyway, in testing I find that there are frameworks out there that enforce the recommended salt length specified in RFC 2898. They are incapable of running 5 out of the 6 test vectors in RFC 6070 because those vectors use the salt “salt” or “sa\0lt”, both of which are less than 64 bits in length. Like I said: “UGH!!!”
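
To put numbers on that, this little sketch measures the salts of the six RFC 6070 vectors and counts how many fall below the 64-bit recommendation. The class and method names are just for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Rfc6070Salts
{
  // Salts of the six RFC 6070 test vectors, in order.
  static final String[] SALTS = {
    "salt", "salt", "salt", "salt",
    "saltSALTsaltSALTsaltSALTsaltSALTsalt",
    "sa\0lt"
  };

  // Length of the salt in bits once encoded as UTF-8.
  static int bits(String salt)
  {
    return salt.getBytes(StandardCharsets.UTF_8).length * 8;
  }

  // Number of RFC 6070 salts shorter than minBits.
  static int countBelow(int minBits)
  {
    int n = 0;
    for (String salt : SALTS)
    {
      if (bits(salt) < minBits)
      {
        n++;
      }
    }
    return n;
  }

  public static void main(String[] args)
  {
    for (String salt : SALTS)
    {
      // Show the NUL character as \0 so the output stays printable.
      System.out.println("\"" + salt.replace("\0", "\\0") + "\" = " + bits(salt) + " bits");
    }
    System.out.println(countBelow(64) + " of " + SALTS.length + " salts are under 64 bits");
  }
}
```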

I know this is relatively minor. You can always implicitly test the validity of algorithms by running the same test data through multiple known/working implementations. It’s just the principle of the thing, ya know?
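
That kind of implicit test can be sketched like this: feed the same inputs, all with salts of 8 octets or more, to two implementations and check that the outputs agree. Here the two are the stock Java PBKDF2WithHmacSHA1 and a compact hand-rolled version written just for this comparison; the inputs and names are arbitrary and purely illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

import javax.crypto.Mac;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class CrossCheck
{
  // Implementation A: the stock JDK PBKDF2WithHmacSHA1.
  static byte[] stock(String p, byte[] salt, int c, int dkLen)
  {
    try
    {
      PBEKeySpec spec = new PBEKeySpec(p.toCharArray(), salt, c, dkLen * 8);
      return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1").generateSecret(spec).getEncoded();
    }
    catch (GeneralSecurityException e)
    {
      throw new IllegalStateException(e);
    }
  }

  // Implementation B: compact PBKDF2 over HmacSHA1, following RFC 2898 section 5.2.
  static byte[] handRolled(String p, byte[] salt, int c, int dkLen)
  {
    try
    {
      Mac mac = Mac.getInstance("HmacSHA1");
      mac.init(new SecretKeySpec(p.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));

      int hLen = mac.getMacLength();
      int l = (dkLen + hLen - 1) / hLen; // number of blocks, rounded up
      byte[] out = new byte[l * hLen];

      for (int i = 1; i <= l; i++)
      {
        // U_1 = PRF(P, S || INT(i)); ByteBuffer is big-endian by default.
        mac.update(salt);
        byte[] u = mac.doFinal(ByteBuffer.allocate(4).putInt(i).array());
        byte[] t = u.clone();

        // U_j = PRF(P, U_{j-1}), XORed into the running block.
        for (int j = 1; j < c; j++)
        {
          u = mac.doFinal(u);
          for (int k = 0; k < t.length; k++)
          {
            t[k] ^= u[k];
          }
        }

        System.arraycopy(t, 0, out, (i - 1) * hLen, hLen);
      }

      return Arrays.copyOf(out, dkLen); // truncate the last block
    }
    catch (GeneralSecurityException e)
    {
      throw new IllegalStateException(e);
    }
  }

  static boolean agree(String p, String s, int c, int dkLen)
  {
    byte[] salt = s.getBytes(StandardCharsets.UTF_8);
    return Arrays.equals(stock(p, salt, c, dkLen), handRolled(p, salt, c, dkLen));
  }

  public static void main(String[] args)
  {
    System.out.println(agree("password", "saltsalt", 1000, 20));
    System.out.println(agree("hunter2", "0123456789abcdef", 1500, 45));
  }
}
```

Agreement between implementations only shows they share the same behavior, not that either matches the RFC, which is why the published vectors still matter.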