PPPP

Final Year Project Presentation
Reverse Engineering Password Hashes

By Aditya Sambamoorthy
Supervisor: Prof Bharadwaj Veeravalli
Project done under A-star under the tutelage and guidance of Dr. SPT Krishnan and Mr. Chee Hoo
Project Overview
A comparative analysis of four existing pre-computed table matrices for reversing cryptographic functions in real-world password-cracking applications
The algorithm was implemented in CUDA C on an NVIDIA Kepler-based GPU architecture (Tesla)
The work involved effective load-balancing measures, multithreading techniques and algorithmic enhancements
Hashing
Maps a string of any length (message) to an output of fixed length (digest)
Minimal properties: pre-image resistance, second pre-image resistance, collision resistance
MD5 Algorithm
MD5 takes as input a message of arbitrary length and produces as output a 128-bit fingerprint or message digest of the input.
MD5 involves the following steps: padding, appending the length, initializing the MD5 buffer, processing the message in 16-word blocks, and output.
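The padding and fixed-length output described above can be checked with a minimal sketch. It is written in Python only for brevity (the project itself used CUDA C) and relies on the standard `hashlib` module; the helper `md5_padded_length` is a hypothetical name introduced here for illustration.

```python
import hashlib

def md5_padded_length(message_len_bytes: int) -> int:
    """Length in bytes of a message after MD5 padding and length append."""
    # Padding always adds one 0x80 byte, then zero bytes until the length
    # is 56 mod 64; the 8-byte original bit length is appended last.
    padded = message_len_bytes + 1
    padded += (56 - padded) % 64
    return padded + 8

for msg in (b"", b"abc", b"a" * 1000):
    digest = hashlib.md5(msg).digest()
    blocks = md5_padded_length(len(msg)) // 64  # 64 bytes = 16 words of 32 bits
    print(f"{len(msg):5d} bytes in, {len(digest) * 8} bits out, "
          f"{blocks} block(s) of 16 words, digest {digest.hex()}")
```

Whatever the input length, the digest is always 128 bits, while the number of 16-word blocks grows with the message.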
Time Memory Trade-Off

There are generally two basic cryptanalytic techniques that can be applied to any cryptosystem independent of its cryptographic design: exhaustive key search and dictionary attack.
TMTO strikes a compromise between the time complexity of an exhaustive key search and the memory complexity of a dictionary attack. Successful cryptanalytic attacks such as Hellman tables use a TMTO-based approach.
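The two extremes that TMTO sits between can be illustrated on a toy password space. This is a sketch in Python (chosen for brevity, not the project's CUDA C); the 4-letter alphabet, the password length and the function names are assumptions made for the example.

```python
import hashlib
from itertools import product

ALPHABET = "abcd"  # toy alphabet (assumption)
LENGTH = 4         # toy password length, so N = 4**4 = 256

def h(pw: str) -> bytes:
    return hashlib.md5(pw.encode()).digest()

def all_passwords():
    for chars in product(ALPHABET, repeat=LENGTH):
        yield "".join(chars)

def exhaustive_search(target: bytes):
    """No stored data, but up to N hash evaluations per target."""
    tries = 0
    for pw in all_passwords():
        tries += 1
        if h(pw) == target:
            return pw, tries
    return None, tries

# Dictionary attack: N stored entries, a single lookup per target.
DICTIONARY = {h(pw): pw for pw in all_passwords()}

target = h("dcba")
found, tries = exhaustive_search(target)
print(f"exhaustive search: {found!r} after {tries} hashes, 0 stored entries")
print(f"dictionary attack: {DICTIONARY[target]!r} after 1 lookup, "
      f"{len(DICTIONARY)} stored entries")
```

A TMTO table stores far fewer than N entries and pays for that with some recomputation at attack time.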
NVIDIA Kepler Architecture

GPUs greatly outpace CPUs in arithmetic throughput and memory bandwidth
Improved context-switching latency and a huge leap in power efficiency
Dynamic Parallelism, Hyper-Q and the Grid Management Unit enable increased GPU utilization and simplify parallel program design
CUDA
CUDA is an extension to C.
A kernel executes in parallel across a set of parallel threads.
The programmer or compiler organizes these threads in thread blocks and grids of thread blocks.
The GPU instantiates a kernel program on a grid of parallel thread blocks.
Each thread has a thread ID and each thread block has a block ID within its grid.
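How block IDs and thread IDs combine into one global work index can be emulated serially. The sketch below is plain Python, not CUDA: the loops only enumerate what a one-dimensional launch would run in parallel, and the names `global_thread_ids` and `blocks_needed` are hypothetical.

```python
def global_thread_ids(grid_dim: int, block_dim: int):
    """Enumerate (block ID, thread ID, global index) for a 1-D launch."""
    for block_id in range(grid_dim):
        for thread_id in range(block_dim):
            # Same arithmetic as blockIdx.x * blockDim.x + threadIdx.x
            yield block_id, thread_id, block_id * block_dim + thread_id

def blocks_needed(n_items: int, block_dim: int) -> int:
    """Round up so that every item gets a thread."""
    return (n_items + block_dim - 1) // block_dim

n_chains = 10
threads_per_block = 4
grid = blocks_needed(n_chains, threads_per_block)

# Threads whose global index falls past the last item do nothing.
work = [g for _, _, g in global_thread_ids(grid, threads_per_block)
        if g < n_chains]
print(grid, work)
```

Rounding the grid size up and guarding with `g < n_chains` is the usual way to cover an item count that is not a multiple of the block size.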
Hellman Tables
Number of tables = t
Length of one chain = t
f_i = f XOR i
m t^2 = N, where N is the number of hash values
Constructing the Tables

Consider an input space of size N, an encryption chain length of t and m starting points obeying N = m·t:
The m chosen starting points are denoted by SP_i, where i ∈ {0, ..., m-1}
For every i, x_{i,0} = SP_i, and within an encryption chain (one row of the table) every subsequent link is generated by x_{i,j} = f(x_{i,j-1})
EP_i denotes the end point of the corresponding SP_i, with EP_i = f^t(SP_i); each pair is stored as (SP_i, EP_i), one row of the final Hellman table
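The construction above can be sketched for a single table. This is an illustrative Python version under stated assumptions (a 3-letter lowercase password space, and a reduction that takes the first 8 digest bytes modulo N); it is not the project's CUDA C code, and all names are hypothetical.

```python
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PW_LEN = 3                      # toy length (assumption)
N = len(ALPHABET) ** PW_LEN     # size of the input space

def index_to_password(i: int) -> str:
    chars = []
    for _ in range(PW_LEN):
        i, r = divmod(i, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)

def f(x: int) -> int:
    """One chain link: hash the password with index x, reduce back to an index."""
    digest = hashlib.md5(index_to_password(x).encode()).digest()
    return int.from_bytes(digest[:8], "big") % N

def build_table(start_points, t: int):
    """Return the rows (SP_i, EP_i) with EP_i = f applied t times to SP_i."""
    table = []
    for sp in start_points:
        x = sp
        for _ in range(t):
            x = f(x)
        table.append((sp, x))
    return table

m, t = 8, 16
table = build_table(range(m), t)
for sp, ep in table:
    print(index_to_password(sp), "->", index_to_password(ep))
```

Only the two ends of each chain are kept; the t - 1 intermediate values are recomputed on demand during the attack.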
Implementation in CUDA
In our cryptanalysis experiment using Hellman tables, we have m = 500,000,000 and t = 10,000 to cover a total input space of 5 billion passwords.
Each f in the encryption chain mentioned earlier was a combination of MD5 hashing, XOR encryption and character selection.
Chain generation was then parallelized across tables, and each chain was made an independently computed unit.
An optimal number of threads, blocks and grids was used, together with appropriate indexing techniques.
Sample function call:
precomputeOnDevice<<<gridDim,threadsPerBlock>>>(startin
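Because each chain is an independent unit, chain generation maps naturally onto parallel workers. The sketch below uses a Python thread pool as a stand-in for CUDA threads; the exact way the project combined MD5 hashing, XOR and character selection is not given in these slides, so the `step` function is only one plausible arrangement, and every name in it is hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PW_LEN = 7

def step(password: str, table_id: int) -> str:
    """One link (assumed form): MD5 hash, XOR with the table number,
    then character selection back into a 7-letter password."""
    digest = hashlib.md5(password.encode()).digest()
    value = int.from_bytes(digest, "big") ^ table_id
    chars = []
    for _ in range(PW_LEN):
        value, r = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)

def compute_chain(job):
    """Independent unit of work: one chain of one table."""
    table_id, start, t = job
    x = start
    for _ in range(t):
        x = step(x, table_id)
    return table_id, start, x

jobs = [(table_id, sp, 50)
        for table_id in range(2)
        for sp in ("aaaaaaa", "passwor")]

with ThreadPoolExecutor(max_workers=4) as pool:
    rows = list(pool.map(compute_chain, jobs))

for table_id, sp, ep in rows:
    print(f"table {table_id}: {sp} -> {ep}")
```

No chain reads another chain's state, so the jobs can be scheduled in any order and across any number of workers.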
Hellman Attack
Let the hash be H and, after reduction and selection, let it be X.
Step 1: Compare X with the EPs.
Step 2: Compare f(X) with the EPs.
Step 3: Compare f(f(X)) with the EPs.
On a match, start from the corresponding SP (SP4 in the figure) and forward compute the chain till the next value is X.
[Figure: start points SP0, SP1, SP2, SP3, SP4 with end points EP0, EP0, EP2, EP2, EP4]
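The comparison steps can be sketched end to end on a toy space. This Python version is an assumption-laden illustration, not the project's CUDA C implementation: the input space is the integers below N, the reduction takes 8 digest bytes modulo N, and all function names are hypothetical.

```python
import hashlib

N = 2000  # toy input space (assumption)

def hash_fn(x: int) -> bytes:
    return hashlib.md5(str(x).encode()).digest()

def reduce_fn(digest: bytes) -> int:
    return int.from_bytes(digest[:8], "big") % N

def f(x: int) -> int:
    return reduce_fn(hash_fn(x))

def build_table(start_points, t: int):
    """Map each end point to the start points of the chains ending there."""
    table = {}
    for sp in start_points:
        x = sp
        for _ in range(t):
            x = f(x)
        table.setdefault(x, []).append(sp)
    return table

def attack(target: bytes, table, t: int):
    y = reduce_fn(target)               # X in the slides
    for _ in range(t):                  # compare X, f(X), f(f(X)), ...
        for sp in table.get(y, ()):     # y matches an end point
            x = sp
            for _ in range(t):          # forward compute from the start point
                if hash_fn(x) == target:
                    return x            # password found
                x = f(x)
            # no hit in this chain: the match was a false positive
        y = f(y)
    return None

t = 30
table = build_table(range(20), t)

secret = 5
for _ in range(10):                     # a value known to lie on chain 5
    secret = f(secret)
print(attack(hash_fn(secret), table, t) == secret)
```

The result is verified against the full hash rather than the reduced value, which is what separates a real hit from a false positive.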
Implementation
Highly parallelized to fully exploit the multi-threaded capability of the 3072 CUDA cores in each GTX 690 card.
Bottleneck: the physical bandwidth limit of the 1 Gigabyte (GB) data transfer link between the GPU and the system's main memory.
Pre-computed values are loaded into CPU memory and then loaded segment by segment into the GPU.
Endpoint values in Hellman tables are sorted in ascending order across all the tables to facilitate the use of binary search.
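A minimal sketch of the sorted-endpoint lookup, written in Python with the standard `bisect` module rather than CUDA C. The sample rows and the function names are made up for illustration.

```python
from bisect import bisect_left

def sort_by_end_point(rows):
    """rows: iterable of (SP, EP). Returns parallel lists ordered by EP."""
    ordered = sorted(rows, key=lambda row: row[1])
    return [ep for _, ep in ordered], [sp for sp, _ in ordered]

def find_start_points(end_points, start_points, value):
    """Binary search for value, then collect every chain that ends there."""
    i = bisect_left(end_points, value)
    matches = []
    while i < len(end_points) and end_points[i] == value:
        matches.append(start_points[i])
        i += 1
    return matches

rows = [(0, 91), (1, 17), (2, 55), (3, 17), (4, 8)]  # made-up (SP, EP) pairs
eps, sps = sort_by_end_point(rows)

print(find_start_points(eps, sps, 17))  # chains ending in 17
print(find_start_points(eps, sps, 42))  # no chain ends in 42
```

Each lookup costs O(log m) comparisons instead of a scan over all m end points, and duplicate end points from merged chains are returned together.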
Collisions and False Positives

[Figure: table of start points SP0, SP1, SP2, SP3, SP4 and end points EP0, EP0, EP2, EP2, EP4, with the labels Collision, Chain Merger and False Positive]
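Once two chains collide in the same column they produce the same values from then on and share an end point, so merged chains show up as duplicate end points. The Python sketch below counts them on a deliberately small toy space; N, m, t and the function names are assumptions for the example, and Python is used only for brevity.

```python
import hashlib
from collections import defaultdict

N = 300  # toy input space, small on purpose so that merges are likely

def f(x: int) -> int:
    digest = hashlib.md5(str(x).encode()).digest()
    return int.from_bytes(digest[:8], "big") % N

def end_point(sp: int, t: int) -> int:
    x = sp
    for _ in range(t):
        x = f(x)
    return x

def merged_chains(start_points, t: int):
    """Group start points by end point and keep groups with more than one chain."""
    by_end = defaultdict(list)
    for sp in start_points:
        by_end[end_point(sp, t)].append(sp)
    return {ep: sps for ep, sps in by_end.items() if len(sps) > 1}

m, t = 40, 20
groups = merged_chains(range(m), t)
redundant = sum(len(sps) - 1 for sps in groups.values())
print(f"{redundant} of {m} chains repeat an end point already in the table")
```

Merged chains waste table rows, and any end-point match whose chain does not actually contain the password is a false positive that costs a full chain regeneration to rule out.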
Experiment
A list of 10,000 passwords was selected and then reduced to 7 characters in length. These passwords were reverse engineered from the list of unsalted SHA1 hashes leaked in the 2012 LinkedIn attack. The program recorded the time needed to process a password hash, the average number of collisions/false positives, the average time taken to find a password and so on.
Results : Accuracy
Results : Average Collisions
Results : Average time
Results : Cumulative Frequency
Possible Improvements
Reducing collisions: convenient data structures such as a red-black tree or a dictionary supporting fast indexing operations can store hitherto computed chains within a table, so that when a collision is detected redundant computational effort can be spared.
Distinguished check points: speed up chain regeneration and also reduce the time taken to resolve false positives.
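The first improvement can be sketched with a Python `set` standing in for the red-black tree or dictionary mentioned above. Every value of every kept chain is remembered, and a new chain is abandoned as soon as it reaches a remembered value. The toy space and all names are assumptions for the example; this is not code from the project.

```python
import hashlib

N = 300  # toy input space (assumption)

def f(x: int) -> int:
    digest = hashlib.md5(str(x).encode()).digest()
    return int.from_bytes(digest[:8], "big") % N

def build_table_without_merges(start_points, t: int):
    seen = set()              # all values of all chains kept so far
    table, discarded = [], 0
    for sp in start_points:
        x, points, merged = sp, [], False
        for j in range(t + 1):
            if x in seen:     # collision with an earlier chain: stop early
                merged = True
                break
            points.append(x)
            if j < t:
                x = f(x)
        if merged:
            discarded += 1
        else:
            seen.update(points)
            table.append((sp, x))
    return table, discarded

m, t = 40, 20
table, discarded = build_table_without_merges(range(m), t)
print(f"kept {len(table)} chains, abandoned {discarded} early")
```

The trade is extra memory during pre-computation (one entry per chain value) in exchange for a table with no duplicate end points.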
THANK YOU
Q&A
