Cracking a Linux password teaches you a lot

My takeaways from actually cracking my own Linux password

Yuta Fujii
7 min read · Dec 26, 2021
Photo by Markus Spiske on Unsplash

Disclaimer:

This post has no intention of encouraging any illegal action. It is for learning purposes only.

Motivation

This year I started to learn cyber security. Along that learning path, I try to actually implement things myself so that the concepts stick.

One of those things is cracking a Linux password.

In this post I share what I learned from cracking a password myself. It covers:

  • Password Cracking
  • Linux password
  • Hash function
  • Birthday Paradox
  • Encoding, Encrypting, Hashing
  • John the Ripper
  • Why GPU is good at this
  • Safety Measures

Password Cracking

The term means exactly what it says. Let’s see the definition from Wikipedia.

Password cracking is the process of recovering passwords from data that has been stored in or transmitted by a computer system in scrambled form.

Password cracking is categorized into several types:

Brute-force Attacks

…try every possible combination of characters until one unlocks the account. This attack takes a lot of time, but in theory every password can be cracked this way.
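To get a feel for "takes a lot of time", here is a minimal sketch of the arithmetic. The guess rate is a made-up assumption (real rates depend heavily on the hash algorithm and hardware), and the function names are mine.

```python
import string

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length

def worst_case_seconds(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Time needed to try every candidate once at a fixed guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second

lowercase = len(string.ascii_lowercase)  # 26 characters
printable = len(string.ascii_letters + string.digits + string.punctuation)  # 94 characters

ASSUMED_RATE = 1_000_000  # hypothetical guesses per second, not a measurement

for label, size in (("lowercase only", lowercase), ("letters+digits+symbols", printable)):
    for length in (6, 8, 10):
        days = worst_case_seconds(size, length, ASSUMED_RATE) / 86_400
        print(f"{label:24} length {length:2}: {keyspace(size, length):.2e} candidates, ~{days:,.1f} days")
```

Every extra character multiplies the work by the alphabet size, which is why length matters so much.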

Dictionary Attacks

…build a word list and try those words and their variants (like ‘password’ and ‘P@ssw0rd’).
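A rough sketch of the idea in Python. The substitution table and the toy_hash function are my own illustration: toy_hash is a single salted SHA-512, which is NOT the sha512crypt scheme Linux really uses, and real tools apply far richer rules.

```python
import hashlib
from itertools import product

# Hypothetical substitution table: each letter maps to the characters to try.
LEET = {"a": "a@", "o": "o0", "s": "s$", "i": "i1", "e": "e3"}

def variants(word):
    """Yield every substitution combination, lowercase and capitalized."""
    choices = [LEET.get(ch, ch) for ch in word.lower()]
    for combo in product(*choices):
        candidate = "".join(combo)
        yield candidate
        yield candidate.capitalize()

def toy_hash(password, salt):
    """Simplified stand-in for a password hash (one salted SHA-512 round)."""
    return hashlib.sha512((salt + password).encode()).hexdigest()

def dictionary_attack(target_hash, salt, wordlist):
    """Return the first variant whose hash matches, or None."""
    for word in wordlist:
        for candidate in variants(word):
            if toy_hash(candidate, salt) == target_hash:
                return candidate
    return None

salt = "coolsalt"
target = toy_hash("P@ssw0rd", salt)
print(dictionary_attack(target, salt, ["letmein", "password", "qwerty"]))
```

The word list only contains ‘password’, yet the variant ‘P@ssw0rd’ is found because the substitutions are generated on the fly.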

Password List Attacks

…try passwords that have already been pwned (leaked in past breaches). This is effective because many people use those same words as their password.

Have I Been Pwned publishes passwords exposed in past breaches, which had piled up to 600 million entries as of November 2020. You can only see hashed passwords (the original words are protected for privacy reasons 👍). If you do want to use the list, downloading it by torrent, a peer-to-peer download method, is recommended to save bandwidth. These terms are also takeaways 😊
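Since the list only contains hashes, checking a password against it means hashing your candidate first. Below is a small sketch that assumes the downloaded file has one uppercase SHA-1 hash and a count per line, separated by a colon. The sample lines and their counts are made up for the demo.

```python
import hashlib

def sha1_upper(password):
    """Uppercase hex SHA-1, the form assumed for the downloaded list."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

def load_counts(lines):
    """Parse 'HASH:COUNT' lines into a dict."""
    counts = {}
    for line in lines:
        digest, _, count = line.strip().partition(":")
        if digest:
            counts[digest] = int(count)
    return counts

def times_seen(password, counts):
    """How often the password appears in the list (0 if absent)."""
    return counts.get(sha1_upper(password), 0)

# Tiny stand-in for the real file; the counts are invented.
sample_lines = [f"{sha1_upper('password')}:3", f"{sha1_upper('123456')}:5"]
counts = load_counts(sample_lines)

for candidate in ("password", "correct horse battery staple"):
    print(candidate, "->", times_seen(candidate, counts))
```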

Linux password

Passwords of Linux users are stored in /etc/shadow .

In the past they were stored in /etc/passwd , but that changed because /etc/passwd is readable by every user on the system.

└─# ls -lhatr /etc/passwd
-rw-r--r-- 1 root root 3.3K Jun 3 11:34 /etc/passwd
└─# ls -lhatr /etc/shadow
-rw-r----- 1 root shadow 1.9K Jun 3 11:34 /etc/shadow

(you can see the ‘r’ permission for others on /etc/passwd, but not on /etc/shadow)
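If you want to check this from a script, Python's stat module can decode the permission bits. The two mode values below are typed in by hand to match the ls output above; on a real system you would read them with os.stat(path).st_mode.

```python
import stat

def others_can_read(mode: int) -> bool:
    """True when the 'others' read bit is set."""
    return bool(mode & stat.S_IROTH)

# Mode values matching the listing above (regular file + permission bits).
passwd_mode = 0o100644  # -rw-r--r--
shadow_mode = 0o100640  # -rw-r-----

for name, mode in (("/etc/passwd", passwd_mode), ("/etc/shadow", shadow_mode)):
    print(name, stat.filemode(mode), "others can read:", others_can_read(mode))
```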

The content of the /etc/shadow file looks like this:

└─# cat /etc/shadow
test1:$6$coolsalt$lBwFVYyzAmmx6k3N5shu4OFCnLmzNjuFhrZLwbch8ruVxelHjD7Kl8bArJd.Ncc3nbf.4xvaEGEjolJGMp6Xf/:18781:0:99999:7:::

“lBwFVYy…p6Xf/” is the hashed password of the test1 user. Of course it is not stored as plain text but as a hash: the $6$ prefix means SHA-512 (sha512crypt) and “coolsalt” is the salt. We’ll look at hashing next.
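To make the fields concrete, here is a small parser for the line above. It is a sketch that assumes the plain "$id$salt$hash" layout (entries with a "rounds=" parameter have one more part), and the scheme table is only a partial list.

```python
from datetime import date, timedelta

# Partial, illustrative table of scheme IDs.
SCHEMES = {"1": "md5crypt", "5": "sha256crypt", "6": "sha512crypt"}

LINE = "test1:$6$coolsalt$lBwFVYyzAmmx6k3N5shu4OFCnLmzNjuFhrZLwbch8ruVxelHjD7Kl8bArJd.Ncc3nbf.4xvaEGEjolJGMp6Xf/:18781:0:99999:7:::"

def parse_shadow_line(line):
    """Split one shadow entry into user, scheme, salt, digest and last change date."""
    fields = line.strip().split(":")
    user, hash_field, last_change = fields[0], fields[1], fields[2]
    # Assumes exactly "$id$salt$hash"; the leading "$" gives an empty first part.
    _, scheme_id, salt, digest = hash_field.split("$")
    return {
        "user": user,
        "scheme": SCHEMES.get(scheme_id, "unknown"),
        "salt": salt,
        "digest": digest,
        # The third field counts days since 1970-01-01.
        "last_change": date(1970, 1, 1) + timedelta(days=int(last_change)),
    }

entry = parse_shadow_line(LINE)
for key, value in entry.items():
    print(f"{key:12} {value}")
```

The scheme name is what you later pass to john as the format (for example sha512crypt).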

Hash function

Generally speaking, a function that converts input text into seemingly random letters is called a hash function. Wikipedia says:

A hash function is any function that can be used to map data of arbitrary size to fixed-size values.

Thus, no matter how long or short the input is, a hash function returns a value of fixed length. This is useful. There are lots of hash functions, such as MD5 and SHA-512.

hash value of input ‘yuta’
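You can reproduce this with Python's hashlib. The sketch below only checks the lengths: a 4-character input and a 4,000-character input give digests of the same size, and changing one letter changes the digest.

```python
import hashlib

def digest_lengths(text):
    """Return the hex digest lengths of (MD5, SHA-512) for the given text."""
    data = text.encode("utf-8")
    return len(hashlib.md5(data).hexdigest()), len(hashlib.sha512(data).hexdigest())

for text in ("yuta", "yuta" * 1000):
    md5_len, sha512_len = digest_lengths(text)
    print(f"input length {len(text):5}: MD5 {md5_len} hex chars, SHA-512 {sha512_len} hex chars")

# A one-letter change gives a different digest.
print(hashlib.md5(b"yuta").hexdigest())
print(hashlib.md5(b"yutb").hexdigest())
```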

For a more precise understanding, you need to know the characteristics required of a cryptographic hash function.

1. One-directional

…computationally infeasible to calculate the input from the output (also called preimage resistance)

2. Second Preimage Resistance

…computationally infeasible to find any second input that has the same output as a given input

3. Collision Resistance

…computationally infeasible to find any pair of inputs that have the same output

Wait, what is the difference between Second Preimage Resistance and Collision Resistance? Well, it becomes clear once you think about the birthday paradox.

Birthday Paradox

Guess how many people you need in order to find someone who has the same birthday as you with 50% probability?

The answer is … 253 people.

Then, guess how many people you need in order to find any two people who share a birthday with 50% probability?

The answer is … 23 people!

This seemingly unbelievable fact is called the birthday paradox. Second Preimage Resistance is similar to the first problem (the target is fixed), and Collision Resistance is similar to the second problem (any pair will do).
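Both numbers are easy to verify. A short sketch, assuming 365 equally likely birthdays and ignoring leap years:

```python
def people_to_match_you(target=0.5, days=365):
    """Smallest number of other people so that someone shares YOUR birthday."""
    n = 0
    p_no_match = 1.0
    while 1 - p_no_match < target:
        n += 1
        p_no_match *= (days - 1) / days  # each person misses your birthday
    return n

def people_for_any_pair(target=0.5, days=365):
    """Smallest group size so that ANY two people share a birthday."""
    n = 0
    p_all_distinct = 1.0
    while 1 - p_all_distinct < target:
        p_all_distinct *= (days - n) / days  # next person avoids the n taken days
        n += 1
    return n

print(people_to_match_you())   # 253
print(people_for_any_pair())   # 23
```

For a hash function, "days" becomes the number of possible outputs, so finding any colliding pair takes far fewer tries than hitting one specific output.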

Some call Second Preimage Resistance and Collision Resistance Weak Collision Resistance and Strong Collision Resistance, respectively.

FYI, Digital signature susceptibility is a good read on why we have to consider Collision Resistance.

Encoding, Encrypting, Hashing

This topic is a bit of a detour.

As programmers we see ‘apparently random letters’ on many occasions, but can you tell the difference between encoding, encrypting and hashing?

Let me clarify them from the cracking perspective.

Encoding

…possible to recover the input from the output if you know which algorithm was used

Encrypting

…you need to know both the algorithm and the key to recover the input

Hashing

…infeasible to recover the input from the output

What’s important here is that we are not safe just because the letters look random. They might be quite easily reversed!
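Here is a small side-by-side sketch of the three. The XOR "cipher" is a toy I use only to show the role of a key, because the Python standard library has no real cipher. Do not use it for actual encryption.

```python
import base64
import hashlib

secret = "hunter2"

# Encoding: anyone who knows the algorithm (Base64) can reverse it.
encoded = base64.b64encode(secret.encode()).decode()
decoded = base64.b64decode(encoded).decode()

# Encrypting: reversing needs the algorithm AND the key (toy XOR, not secure).
def xor_bytes(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

ciphertext = xor_bytes(secret.encode(), b"k3y")
recovered = xor_bytes(ciphertext, b"k3y").decode()
wrong_key = xor_bytes(ciphertext, b"bad")

# Hashing: there is no decode step at all, you can only guess and compare.
digest = hashlib.sha256(secret.encode()).hexdigest()

print("encoded  :", encoded, "->", decoded)
print("encrypted:", ciphertext.hex(), "->", recovered)
print("wrong key:", wrong_key)
print("hashed   :", digest)
```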

For example, a JWT is composed of three parts separated by dots, and the second part is just a Base64URL-encoded string, which may hold personal info of the authenticated user.
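To see how little protection that encoding gives, here is a sketch that reads the payload of a token without any key. The token is built inside the example with made-up claims and a fake signature, and no signature verification is done.

```python
import base64
import json

def b64url_encode(data: bytes) -> str:
    """Base64URL without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

def read_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without checking its signature."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    return json.loads(b64url_decode(payload_b64))

# Demo token with hypothetical claims and a placeholder signature.
header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url_encode(json.dumps({"sub": "1234", "email": "user@example.com"}).encode())
token = f"{header}.{payload}.not-a-real-signature"

print(token)
print(read_jwt_payload(token))
```

The signature protects the token against tampering, but it does not hide the payload.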

John the Ripper

Let’s get back to password cracking.

John the Ripper is an open source tool for auditing password security.

It covers Unix, macOS and Windows, and it can audit WordPress password security and SSH security as well.


How to use

The cracking process is quite simple. First you combine the passwd file and the shadow file ( unshadow command), and then you run the john command.

You may also want to supply a password list with the --wordlist= option.

└─# unshadow /etc/passwd /etc/shadow > crackme.txt
└─# john --wordlist=/usr/share/john/password.lst --rules crackme.txt

*You are encouraged to do the cracking not on your real PC but in a VM image or sandbox environment.

Run john on EC2 with GPU

I will explain why a GPU is suitable for password cracking later, but first let me explain how to do this.

This time I used a g4dn.xlarge instance, which comes with an NVIDIA GPU. It is cost-friendly compared to other GPU instances. That said, it’s still $0.71/hour 😓

The cracking command is almost the same as in the example above. You only need some extra preparation:

  • Install GPU driver
  • Install CUDA
  • Install John
# Check your GPU spec
lspci | grep -i nvidia
# Install GPU driver
sudo apt install nvidia-utils-460-server
# Driver download page: https://www.nvidia.com/Download/driverResults.aspx/169408/en-us
sudo apt-get update
sudo apt install nvidia-driver-440
# Install CUDA and the build dependencies
sudo apt install nvidia-cuda-toolkit
sudo apt-get install build-essential libssl-dev
# Install John the Ripper
wget https://www.openwall.com/john/k/john-1.9.0-jumbo-1.tar.gz
tar xfz john-1.9.0-jumbo-1.tar.gz
cd john-1.9.0-jumbo-1/src
./configure
make -s -j 4
# The build places the john binary in ../run
cd ../run
# Confirm installation is done properly
sudo ./john --list=opencl-devices
sudo ./john --list=formats --format=opencl
The last command lists the hash formats that john can run on the GPU through OpenCL, such as sha512crypt-opencl.

Now that you are ready to crack passwords using the GPU, here are the commands to run:

unshadow /etc/passwd /etc/shadow > crackme.txt
# Run john according to hashing algorithm used in password
# SHA512 using GPU
sudo ./john --format=sha512crypt-opencl crackme.txt
# Other options
# MD5
sudo ./john --format=md5crypt crackme.txt
# SHA512
sudo ./john --format=sha512crypt crackme.txt

While john is running, you can confirm that your GPU is actually working with the nvidia-smi command.

Why GPU is good at this

Many programmers know intuitively that this is what GPUs are for. Me, too. But why? That question pushed me to study a bit further.

A key characteristic of GPU

A description of the GPU that I personally found interesting was: “The CPU has pursued minimizing latency; the GPU is the fruit of pursuing maximum throughput”.

  • When we say GPU computing, we mean GPGPU (general-purpose computing on GPUs)
  • A GPU’s architecture is characterized by having far more ALUs than a CPU
  • An ALU (Arithmetic Logic Unit) is the place where arithmetic and bit operations are actually executed

Of course the CPU is still important, because it has a higher clock frequency and is great at running many different tasks concurrently.

GPU is good at parallelism

To put it simply, here is why a GPU performs well at password cracking:

  • Because of this architecture, a GPU is good at parallelism
  • Password cracking suits parallelism, since it consists of tons of identical operations where only the input string changes
  • Thus password cracking is executed efficiently by a GPU
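The "same operation, different input" shape can be sketched on a CPU too. In the example below, threads stand in for GPU lanes: each worker takes one first letter and searches its chunk without talking to the others. It uses a plain SHA-512 and a 3-letter lowercase target as simplifying assumptions, and it only illustrates the structure, not GPU speed.

```python
import hashlib
import string
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def sha512_hex(text):
    return hashlib.sha512(text.encode()).hexdigest()

def search_chunk(first_letter, target, length=3):
    """Try every lowercase candidate that starts with first_letter."""
    for tail in product(string.ascii_lowercase, repeat=length - 1):
        candidate = first_letter + "".join(tail)
        if sha512_hex(candidate) == target:
            return candidate
    return None

def parallel_search(target, workers=4):
    """Split the keyspace by first letter and search the chunks independently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda letter: search_chunk(letter, target), string.ascii_lowercase)
        return next((r for r in results if r is not None), None)

target = sha512_hex("gpu")
print(parallel_search(target))
```

Because no chunk depends on another, the work can be spread over as many units as the hardware offers, and a GPU offers thousands.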

CUDA

In the EC2 example I installed CUDA.

CUDA is NVIDIA’s platform and API for instructing the GPU to execute general-purpose tasks in parallel.

Safety Measures

Finally let’s see some aspects of this attack.

How you get compromised in the real world:

  1. An attacker tries phishing, or RCE (remote code execution) by exploiting an OS command injection vulnerability, in order to execute commands with root permission
  2. The attacker sends the shadow file to their remote server. Note that your network routing is easily modified once the attacker gets root privilege!
  3. The attacker cracks the passwords using a GPU

You see the point? Cracking is done offline. The scary part is that you cannot detect the cracking itself.

Protection Measures

  • Remove OS command injection vulnerabilities by updating your OS and libraries as soon as updates are available
  • Monitor for unusual network traffic to detect step 2 (sending the file to a remote server)
  • Use passwords that are hard to crack
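For the last point, one simple option is to let the machine pick the password. A minimal sketch using Python's secrets module follows. The alphabet and lengths are my own choices, and the entropy figure only holds for passwords chosen uniformly at random like this.

```python
import math
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation  # 94 characters

def generate_password(length=16):
    """Pick each character with a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def entropy_bits(alphabet_size, length):
    """Bits of entropy for a uniformly random password."""
    return length * math.log2(alphabet_size)

print(generate_password())
print(f"8 lowercase letters : {entropy_bits(26, 8):.1f} bits")
print(f"16 mixed characters : {entropy_bits(len(ALPHABET), 16):.1f} bits")
```

Each extra bit doubles the work of a brute-force attack, and a random password does not show up in any word list.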

Wrap up

In this post, I wanted to share how many things we can learn from a tiny idea like ‘password cracking’.

And you won’t really learn until you actually do it. Why don’t you give it a try? 😌

Happy legal cracking


Yuta Fujii

Web developer, Data analyst, Product Manager. Ex investment banker( structured finance ). Learn or Die.