
Final Year Project Presentation

Reverse Engineering Password Hashes


By Aditya Sambamoorthy
Supervisor: Prof Bharadwaj Veeravalli
Project done at A-star under the tutelage and guidance of Dr. SPT Krishnan and Mr. Chee Hoo

Project Overview
A comparative analysis of four existing pre-computed table matrices for reversing cryptographic functions in real-world password-cracking applications.

The algorithm was implemented in CUDA-C on an NVIDIA Kepler-based GPU architecture (Tesla).

Involved use of effective load balancing measures, multithreading techniques and algorithmic enhancements.

Hashing
Maps a string of any length (the message) to an output of fixed length (the digest).

Minimal properties:
Pre-image resistance
Second pre-image resistance
Collision resistance

MD5 Algorithm
MD5 takes as input a message of arbitrary length and produces as output a 128-bit fingerprint or message digest of the input.

MD5 involves the following steps:
Padding
Append length
Initialize MD5 buffer
Process in 16-word blocks
Output
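The first two steps and the fixed digest size can be illustrated with a short Python sketch. It assumes the padding rule from the MD5 specification (a single 1 bit, zeros up to 56 bytes modulo 64, then the 64-bit little-endian bit length); the helper name `md5_pad` is hypothetical, and the digest itself is computed with the standard `hashlib` module rather than re-implemented.

```python
import hashlib
import struct


def md5_pad(message: bytes) -> bytes:
    """Apply the 'padding' and 'append length' steps to a message."""
    bit_length = (len(message) * 8) % (1 << 64)
    padded = message + b"\x80"                          # a 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)  # zeros up to 56 mod 64 bytes
    padded += struct.pack("<Q", bit_length)             # 64-bit little-endian length
    return padded


for msg in [b"", b"abc", b"a" * 56, b"a" * 1000]:
    blocks = len(md5_pad(msg)) // 64                    # number of 16-word blocks
    digest_bits = len(hashlib.md5(msg).digest()) * 8    # always 128
    print(len(msg), blocks, digest_bits)
```

Whatever the input length, the padded message is a whole number of 64-byte (16-word) blocks and the digest stays at 128 bits.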

Time Memory Trade-Off


There are generally two basic cryptanalytic techniques that can be applied to any cryptosystem, independent of its cryptographic design: exhaustive key search and dictionary attack.

A time-memory trade-off (TMTO) strikes a compromise between the time complexity of an exhaustive key search and the memory complexity of a dictionary attack. Successful cryptanalytic attacks such as Hellman tables use a TMTO-based approach.

NVIDIA Kepler Architecture


GPUs greatly outpace CPUs in arithmetic throughput and memory bandwidth.
Improved context-switching latency and a huge leap in power efficiency.
Dynamic Parallelism, Hyper-Q and the Grid Management Unit enable increased GPU utilization and simplify parallel program design.

CUDA
Extension to C.
A kernel executes in parallel across a set of parallel threads.
The programmer or compiler organizes these threads into thread blocks and grids of thread blocks.
The GPU instantiates a kernel program on a grid of parallel thread blocks.
Each thread has a thread ID and each thread block has a block ID within its grid.

Hellman Tables

Number of tables = $t$
Length of one chain = $t$
$f_i = f \oplus i$
$m t^2 = N$, where $N$ is the number of hash values
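As a rough illustration of these parameters, the Python sketch below computes the standard Hellman cost estimates for a given m and t: the space covered, the number of stored (SP, EP) pairs, and the worst-case number of f evaluations during an attack. It ignores chain merges and false positives, and the function name `hellman_costs` is hypothetical.

```python
def hellman_costs(m: int, t: int) -> dict:
    """Idealized Hellman trade-off figures for t tables of m chains of length t."""
    return {
        "covered_space": m * t * t,  # N = m * t^2 when no chains overlap
        "stored_pairs": m * t,       # m (SP, EP) rows in each of the t tables
        "online_f_evals": t * t,     # up to t applications of f in each of t tables
    }


# Example: m = t = 1000 covers about 10^9 values with 10^6 stored pairs.
print(hellman_costs(1000, 1000))
```

The memory cost grows with m·t while the attack time grows with t², which is the compromise between exhaustive search and a full dictionary.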

Constructing the Tables


Consider an input space of $N$, an encryption chain length of $t$ and $m$ starting points obeying $N = m \cdot t$:
The $m$ chosen starting points are denoted by $SP_i$, where $i \in \{0, \dots, m-1\}$.
For every chain $i$, $x_{i,0} = SP_i$, and within an encryption chain (one row of the table) every subsequent link is generated by $x_{i,j} = f(x_{i,j-1})$.
$EP_i$ denotes the end point corresponding to $SP_i$, with $EP_i = f^t(SP_i)$; the pair $(SP_i, EP_i)$ is stored as a row in the final Hellman tables.
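The construction above can be sketched on the CPU in Python. This is a minimal illustration, not the project's CUDA code: the toy alphabet, the 3-character password length and the reduction in `reduce_digest` (XOR with the table index, then character selection) are assumptions made for the example.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase  # assumed toy character set
PW_LEN = 3                         # assumed toy password length


def reduce_digest(digest: bytes, table_index: int = 0) -> str:
    """Hypothetical reduction: XOR with the table index, then select characters."""
    value = int.from_bytes(digest, "big") ^ table_index
    chars = []
    for _ in range(PW_LEN):
        value, r = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def f(x: str, table_index: int = 0) -> str:
    """One chain link: hash the password, then reduce back to a password."""
    return reduce_digest(hashlib.md5(x.encode()).digest(), table_index)


def build_chain(sp: str, t: int, table_index: int = 0) -> str:
    """Return EP = f^t(SP) by applying f exactly t times."""
    x = sp
    for _ in range(t):
        x = f(x, table_index)
    return x


def build_table(starting_points, t, table_index=0):
    """Store only the (SP, EP) pair of each chain, one row per chain."""
    return [(sp, build_chain(sp, t, table_index)) for sp in starting_points]


print(build_table(["aaa", "aab", "aac"], t=5))
```

Each chain depends only on its own starting point, which is what makes chain generation easy to run in parallel.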

Implementation in CUDA
In our cryptanalysis experiment using Hellman tables, we have m = 500,000,000 and t = 10,000 to cover a total input space of 5 billion passwords.
Each f in the encryption chain mentioned earlier was a combination of MD5 hashing, XOR encryption and character selection.
The chain generation operation was then parallelized across tables, and each chain was made an independently computed unit.
An optimal number of threads, blocks and grids was used, together with appropriate indexing techniques.
Sample function call:

precomputeOnDevice<<<gridDim,threadsPerBlock>>>(startin
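How a one-dimensional launch of this kind typically maps threads to chains can be sketched in Python. The default of 256 threads per block and the helper names are assumptions for illustration; the source does not state the values used in the project.

```python
def launch_config(num_chains: int, threads_per_block: int = 256):
    """Return (blocks, threads_per_block) with blocks * threads >= num_chains."""
    blocks = (num_chains + threads_per_block - 1) // threads_per_block  # ceiling division
    return blocks, threads_per_block


def global_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Mirror of the usual CUDA expression blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def simulate_launch(num_chains: int, threads_per_block: int = 256):
    """List the chain index handled by each simulated thread, in launch order."""
    blocks, tpb = launch_config(num_chains, threads_per_block)
    handled = []
    for b in range(blocks):
        for th in range(tpb):
            idx = global_index(b, tpb, th)
            if idx < num_chains:  # guard: surplus threads in the last block do nothing
                handled.append(idx)
    return handled


print(launch_config(1000))  # (4, 256)
```

With the bounds guard in place, every chain is computed by exactly one thread even when the number of chains is not a multiple of the block size.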

Hellman Attack
Let the hash be H, and let X be the value obtained from it by reduction and selection.

Step 1: Compare X with the EPs
Step 2: Compare f(X) with the EPs
Step 3: Compare f(f(X)) with the EPs

SP    EP
SP0   EP0
SP1   EP0
SP2   EP2
SP3   EP2
SP4   EP4

On a match (here with EP4): start from SP4 and compute the chain forward until the next value is X.
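The attack steps on this slide can be sketched for a single table in Python. The toy alphabet, password length and reduction function are assumptions for illustration only; the lookup also re-checks every candidate against the target hash, so a false positive simply falls through to the next step.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase  # assumed toy character set
PW_LEN = 3                         # assumed toy password length


def md5(x: str) -> bytes:
    return hashlib.md5(x.encode()).digest()


def reduce_digest(digest: bytes) -> str:
    """Hypothetical reduction and character selection."""
    value = int.from_bytes(digest, "big")
    chars = []
    for _ in range(PW_LEN):
        value, r = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def f(x: str) -> str:
    return reduce_digest(md5(x))


def build_table(starting_points, t):
    """Map each EP to the list of SPs whose chains end there."""
    table = {}
    for sp in starting_points:
        x = sp
        for _ in range(t):
            x = f(x)
        table.setdefault(x, []).append(sp)  # merged chains share an EP
    return table


def lookup(target_hash: bytes, table, t):
    x = reduce_digest(target_hash)          # X
    for _ in range(t):                      # compare X, f(X), f(f(X)), ... with the EPs
        for sp in table.get(x, []):
            candidate = sp                  # regenerate the chain from its SP
            for _ in range(t):
                if md5(candidate) == target_hash:
                    return candidate        # the value just before X in the chain
                candidate = f(candidate)
            # no hit in this chain: the EP match was a false positive
        x = f(x)
    return None


table = build_table(["aaa", "bbb", "ccc"], t=10)
password = f(f(f(f("aaa"))))                # a value known to lie on the first chain
print(lookup(md5(password), table, t=10) == password)
```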

Implementation
The implementation is highly parallelized to make full use of the multi-threaded capability of the 3072 CUDA cores in each GTX 690 card.
Bottleneck: the physical bandwidth limit of the 1 gigabyte (GB) data transfer link between the GPU and the system's main memory.

Pre-computed values are loaded into the CPU's main memory and then transferred segment by segment to the GPU.
Endpoint values in Hellman tables are sorted in ascending order across all the tables to facilitate the use of binary search.
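A minimal Python sketch of this lookup uses the standard `bisect` module. The integer SP and EP values are made-up sample data, and the helper names are hypothetical; the point is only that a sorted endpoint column allows a logarithmic-time search that still returns every chain sharing an endpoint.

```python
from bisect import bisect_left


def sort_by_endpoint(rows):
    """Sort (SP, EP) rows by EP so that the EP column can be binary searched."""
    return sorted(rows, key=lambda row: row[1])


def find_starting_points(sorted_rows, endpoints, x):
    """Return the SPs of all rows whose EP equals x."""
    i = bisect_left(endpoints, x)  # first position where x could appear
    found = []
    while i < len(endpoints) and endpoints[i] == x:
        found.append(sorted_rows[i][0])
        i += 1
    return found


rows = [(10, 907), (11, 15), (12, 402), (13, 15), (14, 650)]  # sample (SP, EP) pairs
sorted_rows = sort_by_endpoint(rows)
endpoints = [ep for _, ep in sorted_rows]

print(find_starting_points(sorted_rows, endpoints, 15))   # [11, 13]
print(find_starting_points(sorted_rows, endpoints, 500))  # []
```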

Collisions and False Positives


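SP    EP
SP0   EP0
SP1   EP0
SP2   EP2
SP3   EP2
SP4   EP4

[Figure: a collision between two chains leads to a chain merger, so both chains end at the same end point (for example EP2); a match against such an end point can turn out to be a false positive.]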
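Merged chains can be spotted in a finished table because they share an end point. The Python sketch below groups the five rows shown on this slide by EP; the function name `merged_chains` is hypothetical.

```python
from collections import defaultdict

# The (SP, EP) rows from the slide's table.
rows = [("SP0", "EP0"), ("SP1", "EP0"), ("SP2", "EP2"), ("SP3", "EP2"), ("SP4", "EP4")]


def merged_chains(rows):
    """Return the end points reached by more than one chain, with their SPs."""
    by_ep = defaultdict(list)
    for sp, ep in rows:
        by_ep[ep].append(sp)
    return {ep: sps for ep, sps in by_ep.items() if len(sps) > 1}


print(merged_chains(rows))  # {'EP0': ['SP0', 'SP1'], 'EP2': ['SP2', 'SP3']}
```

Every group with more than one starting point marks a merger, and during an attack each chain in that group may have to be regenerated before the right one is found.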

Experiment
A list of 10,000 passwords was selected, each reduced to 7 characters in length. These passwords had been reverse engineered from the list of unsalted SHA-1 hashes leaked in the 2012 LinkedIn attack. The results of the program were then recorded: the time needed to process a password hash, the average number of collisions/false positives, the average time taken to find a password, and so on.

Results: Accuracy

Results: Average Collisions

Results: Average Time

Results: Cumulative Frequency

Possible Improvements
Reducing collisions: convenient data structures, such as a red-black tree or a dictionary supporting fast indexing operations, can store the chains computed so far within a table, so that redundant computational effort is spared when a collision is detected.
Distinguished checkpoints: speed up chain regeneration and also reduce the time taken to resolve false positives.
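One possible reading of the collision-reduction idea is sketched below in Python: a set (a hash-based structure with fast membership tests) holds every value already placed in the table, and a new chain is abandoned as soon as it runs into one of them. The toy reduction, alphabet and function names are assumptions, and the extra memory needed during pre-computation is the price of this approach.

```python
import hashlib
import string

ALPHABET = string.ascii_lowercase  # assumed toy character set
PW_LEN = 3                         # assumed toy password length


def f(x: str) -> str:
    """Toy chain link: MD5 followed by a hypothetical character selection."""
    value = int.from_bytes(hashlib.md5(x.encode()).digest(), "big")
    chars = []
    for _ in range(PW_LEN):
        value, r = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def build_table_without_merges(starting_points, t):
    """Keep only chains that never touch a value stored by an earlier chain."""
    seen = set()  # every value on every chain kept so far
    table = []
    for sp in starting_points:
        x = sp
        chain = [x]
        collided = x in seen
        while not collided and len(chain) <= t:  # at most t applications of f
            x = f(x)
            collided = x in seen                 # stop early: the rest would be redundant
            chain.append(x)
        if collided:
            continue                             # discard the merging chain
        seen.update(chain)
        table.append((sp, x))
    return table


sps = [a + b + "a" for a in "abcdefg" for b in "abcdefg"]
table = build_table_without_merges(sps, t=20)
print(len(sps), len(table))
```

Because a chain is dropped at the first repeated value, the surviving rows all have distinct end points.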

THANK YOU

Q&A
