Fernet is a useful tool in the arsenal of a Python developer. It helps developers secure data without taking on the risks that come with assembling cryptographic primitives themselves.
If you are interested in implementing fernet, you may be wondering whether it’s a suitable option for encrypting and authenticating messages. In this article, we’ll examine how fernet encrypts and decrypts data, highlighting its strengths as well as its limitations.
What is fernet?
Fernet is a recipe that provides symmetric encryption and authentication for data. It is part of the cryptography library for Python, which is developed by the Python Cryptographic Authority (PyCA).
There are a range of different use cases for Fernet. Real-world examples include:
- Apache Airflow — This workflow monitoring and scheduling platform implements fernet to encrypt passwords for both the variable configuration and the connection configuration. This helps to keep the passwords safe from attackers.
- Red Hat’s Overcloud — Fernet keys can be used to provide encryption in Red Hat’s Overcloud, which is the company’s OpenStack Platform environment for creating and administering network resources in private and public clouds.
- Databricks — Fernet can play a role in securing personally identifiable information alongside other tools like Databricks. This information is prized by hackers, so it’s important to have secure encryption and authentication mechanisms like fernet to protect it.
What is a recipe?
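In the cryptography library, recipes are high-level, ready-made APIs that solve a common problem, such as encrypting a message, while making the low-level cryptographic decisions for you.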
What is the Python Cryptographic Authority (PyCA)?
The Python Cryptographic Authority—PyCA for short—is not quite as official as its name makes it seem. Although there is no official licensing body that oversees it, PyCA is a loosely knit team of developers who work together to solve common cryptography problems in the language. Their work aims to make it simpler for coders to secure their applications.
What is cryptography?
cryptography is a package from the Python Cryptographic Authority that makes it easier and safer for Python developers to deploy the necessary cryptographic mechanisms within their software. The components of cryptography can be divided into two levels.
One is the high-level recipes, which require developers to make very few decisions in the implementation process. Because there are so few implementation decisions, there are fewer places for a developer to screw up, which ultimately makes these high-level recipes far safer than if a programmer were to attempt to cobble together all of the algorithms on their own.
Given how complex cryptography is and how easy it is for someone to open up their app to side-channel attacks, these recipes help to take out the guesswork and lead to a much safer digital ecosystem.
The other major component of cryptography is its collection of cryptographic primitives. In contrast to the high-level recipes that are basically ready-to-go, these primitives are more like cryptographic building blocks, which need to be pieced together in the right way in order to fully meet the security needs of a given application.
These cryptographic primitives should only be implemented when the high-level recipes are unsuitable for a given task, and they must be approached with extreme care. This is because there are a lot more choices for a developer to make when implementing these cryptographic primitives, which means that there are many more places for things to go wrong. It’s best to avoid these unless you have a lot of experience in the field of cryptography. The PyCA developers even refer to them as “hazardous materials”, so you know you have to be careful whenever you handle them.
cryptography vs other Python cryptographic libraries
If you’ve been searching for resources to help you with cryptographic implementations in Python, you may have noticed libraries such as PyOpenSSL, M2Crypto and PyCrypto.
The PyCA developers state that they built the cryptography package to address some of the problems that they found in these legacy libraries. Among the issues in competing libraries, they found:
- A lack of useful algorithms like HKDF and AES-GCM.
- A lack of maintenance.
- APIs that were error prone and had insecure default options.
- Limited Python 3 and PyPy support.
- A lack of high-level APIs.
- Some algorithms were implemented poorly. In some cases, this potentially opened the door to side-channel attacks.
Another option you may have found is NaCl, which is built by a team that includes some of the major figures in the world of cryptography. NaCl is pronounced “salt”, for those who enjoy chemistry-related wordplay.
Much like the cryptography package, the aim of NaCl is to simplify cryptography and make it easier for developers to securely implement these algorithms. If you are more inclined toward NaCl’s design, the PyCA team also maintain PyNaCl, so you can trust it just as much as cryptography.
How does fernet encrypt and authenticate data?
As we stated, fernet is a recipe from the cryptography package that you can use to encrypt and authenticate data. As a recipe, it’s relatively easy to deploy, and it takes out much of the guesswork involved in cryptography.
When fernet is implemented correctly, an attacker can’t read or meddle with a message that has been encrypted and authenticated with it.
There are three important inputs to fernet, alongside a randomly chosen initialization vector:
- A plaintext message provided by the user. This is the data that the user wants to encrypt, and it takes the form of an arbitrary sequence of bytes.
- A user-supplied key, which is 256 bits in length. Fernet splits it into a 128-bit signing key for the HMAC and a 128-bit encryption key for AES.
- The current time.
When these inputs are run through the fernet recipe, it produces a token. This contains the message in encrypted form, as well as an HMAC (we’ll explain what this is further down the page) that authenticates the message. Together, these prevent the message from being read or altered without the key.
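As a minimal sketch of these inputs and the resulting token, the snippet below uses the Fernet class from the cryptography package. The message text is only an illustration. The library records the current time and picks the initialization vector itself, so the caller only supplies the key and the plaintext.

```python
from cryptography.fernet import Fernet

# Generate a fresh 256-bit key (returned base64url-encoded).
key = Fernet.generate_key()
f = Fernet(key)

# The plaintext is an arbitrary sequence of bytes (illustrative message).
token = f.encrypt(b"attack at dawn")

# Anyone holding the same key can verify and decrypt the token.
plaintext = f.decrypt(token)
```

The key must be kept secret and stored safely, because anyone who holds it can both read and forge tokens.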
Fernet’s cryptographic primitives
Fernet is built from several cryptographic primitives. These are the “hazardous materials” building blocks we mentioned before. However, in fernet’s recipe, they have been assembled together by specialists. Using fernet instead of tying together the cryptographic primitives yourself can help you avoid the many pitfalls that come with implementing your own crypto.
As confident as you may be in your skills, cryptography is a minefield, and it’s always best to leave it to the experts whenever you can. There’s no need to reinvent the wheel, especially when reinventing the wheel could lead to a major security disaster.
Fernet’s cryptographic primitives include:
- 128-bit AES in CBC mode.
- An HMAC with SHA-256.
Don’t worry, we will explain what each of these terms means in the coming paragraphs.
What is AES?
One of the cryptographic primitives is AES, the Advanced Encryption Standard. It is based on the Rijndael cipher, which the National Institute of Standards and Technology (NIST) selected through a public competition involving the wider cryptographic community and then standardized in 2001.
It’s a fast, symmetric-key encryption algorithm that we use all the time to encrypt data in storage and in transit. You may never really notice it in your daily browsing, but if you are connecting to this website by HTTPS (look for the lock icon on the left of your address bar), then the information traveling between our server and your web browser is most likely secured by AES.
AES has three different key sizes, 128 bits, 192 bits and 256 bits. However, fernet only uses 128-bit AES keys. 128-bit AES is considered more than secure enough for the vast majority of applications.
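The sketch below shows how a 256-bit fernet key relates to the 128-bit AES key, using only the standard library. Under the fernet specification, the first half of the decoded key is the signing key and the second half is the encryption key. The variable names are illustrative.

```python
import base64
import os

# A fernet key is 32 random bytes, base64url-encoded for easy storage.
key_b64 = base64.urlsafe_b64encode(os.urandom(32))

# Decoding gives back the 32 raw bytes, which fernet splits in two.
raw_key = base64.urlsafe_b64decode(key_b64)
signing_key = raw_key[:16]      # 128 bits, used for the SHA-256 HMAC
encryption_key = raw_key[16:]   # 128 bits, used for AES in CBC mode
```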
What is CBC mode?
Before we can explain what CBC mode is, we need to back up a bit and explain why we need it in the first place. AES is a block cipher. This basically means that the algorithm processes the data in fixed-length chunks, which are known as blocks. If you have a large amount of data, it cannot be encrypted all at once. First, it needs to be divided into blocks that have a fixed length, and each block is processed separately.
A block cipher like AES can only securely encrypt one block of data on its own. AES has a block size of 128 bits, which means that it can only process 128 bits at a time. This is a relatively small amount of data, so for many practical applications, we need a way to securely encrypt multiple blocks of data together.
There are a number of ways we can do this, each of which is known as a mode of operation. These modes of operation specify how a cipher like AES will securely encrypt multiple blocks of data, allowing AES to be used to encrypt and decrypt much larger volumes of data.
CBC refers to cipher block chaining, which is an older method for encrypting multiple blocks of data together. It begins with three inputs:
- The plaintext that a user wants to encrypt.
- An initialization vector (IV), which is an input that is only used for the first block. Subsequent blocks use the output from the previous block as the input instead. Since there are no blocks before the first one, we need another input that can provide the initial state. This is where initialization vectors step in. In the case of fernet, a new, randomly-chosen initialization vector must be used for each token.
- A key that helps to provide confidentiality to the data.
Figure: The CBC mode encryption process, which lets fernet encrypt more than one block of data. CBC encryption diagram by WhiteTimberwolf.
The first block of plaintext is combined with the initialization vector through an XOR operation. XOR is a bitwise operation that compares two inputs bit by bit and outputs 1 where the bits differ and 0 where they match. Applying the same XOR a second time undoes it, which is what makes decryption possible.
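XOR is easy to try out in Python, where the ^ operator works on integers. The helper below is a small illustrative sketch (the name xor_bytes and the sample bytes are made up for this example) that applies it across two byte strings of equal length.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings, one byte pair at a time."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

block = b"\x0f\xf0\xaa"
iv = b"\xff\xff\x00"

mixed = xor_bytes(block, iv)      # b"\xf0\x0f\xaa"
restored = xor_bytes(mixed, iv)   # XORing with the same value undoes it
```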
The result of the XOR operation is then run through the many steps of the AES block cipher, alongside the key. This produces a block of ciphertext, which is then XORed with the next block of the plaintext that needs to be encrypted. This means that in the second block, as well as all subsequent blocks, the output from the previous block acts in much the same way as the initialization vector did for the first block.
CBC mode decryption follows a similar, yet somewhat reversed process. The first block of ciphertext is run through the decryption cipher alongside the key. The same initialization vector is XORed with the result, revealing the first block of plaintext. The second and all subsequent blocks follow a similar decryption process, except the previous block’s ciphertext takes the place of the initialization vector.
Figure: The CBC mode decryption process, which lets fernet decrypt more than one block of data. CBC decryption diagram by WhiteTimberwolf.
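To make the chaining concrete, here is a teaching sketch that builds CBC by hand on top of single-block AES calls from the cryptography package's hazardous materials layer. The function names are hypothetical, the input is assumed to be padded already, and there is no authentication, so this is for illustration only. In real code, use fernet or the library's own CBC mode.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK = 16  # AES block size in bytes (128 bits)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def aes_encrypt_block(key: bytes, block: bytes) -> bytes:
    # ECB on exactly one block is just the raw AES block cipher.
    enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    return enc.update(block) + enc.finalize()

def aes_decrypt_block(key: bytes, block: bytes) -> bytes:
    dec = Cipher(algorithms.AES(key), modes.ECB()).decryptor()
    return dec.update(block) + dec.finalize()

def cbc_encrypt(key: bytes, iv: bytes, padded: bytes) -> bytes:
    previous, out = iv, b""
    for i in range(0, len(padded), BLOCK):
        # XOR the plaintext block with the previous ciphertext (or the IV).
        previous = aes_encrypt_block(key, xor_bytes(padded[i:i + BLOCK], previous))
        out += previous
    return out

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    previous, out = iv, b""
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt first, then XOR with the previous ciphertext (or the IV).
        out += xor_bytes(aes_decrypt_block(key, block), previous)
        previous = block
    return out

key, iv = os.urandom(16), os.urandom(16)
message = b"A" * 16 + b"B" * 16          # two full blocks, no padding needed
ciphertext = cbc_encrypt(key, iv, message)
recovered = cbc_decrypt(key, iv, ciphertext)
```

Note how encryption has to walk the blocks in order, while each decrypted block only depends on two ciphertext blocks that are already known.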
The weaknesses of CBC mode
CBC is an old mode of operation, and it’s not without its downsides. For one, CBC is relatively slow, because it’s a sequential algorithm. The output of the previous block is required before you can begin computing the next block, which means that you cannot encrypt multiple blocks in parallel. Ultimately, this makes it slower than some of the rival modes of operation that allow for parallelization.
CBC can also be vulnerable to a range of attacks if it isn’t deployed correctly, such as padding oracle attacks. This is why it’s important for AES-CBC to be implemented alongside appropriate padding schemes, the right initialization vectors, and secure authentication mechanisms.
Another issue is that AES-CBC lacks in-built authentication. This means that if you want to be able to tell whether data has been meddled with, it needs to be implemented alongside additional authentication mechanisms. Fernet solves this problem by using an HMAC, which we will describe later.
What is padding?
The AES block cipher processes data in 128-bit blocks. But what happens if we have 50 bits of data, 135 bits of data, or 587 bits of data?
We fill up the extra space with padding.
At its core, padding involves adding extra data to ensure that the input is the right size so that an algorithm like AES can process it. In the case of 50 bits of data, we would need to pad it with an extra 78 bits of data to bring it up to one full 128-bit block. With 135 bits of data, we would have 7 bits left over from the first 128-bit block, meaning that we would need an extra 121 bits to pad out our second block.
If we had 587 bits of data that we needed to encrypt, the first 512 bits would be split over four 128-bit blocks. The remaining 75 bits would be left over for a fifth block, which would need an extra 53 bits of padding to fill it up to a 128-bit block size. This means that we would have five 128-bit blocks, with a total of 640 bits.
Padding actually accomplishes more than just filling up empty space. Some padding schemes can also obscure the exact length of the message that is encrypted. Fernet pads data with a scheme known as PKCS#7, which works in whole bytes rather than bits: it always adds between 1 and 16 bytes, and each padding byte holds the number of bytes that were added.
Note that CBC mode doesn’t always require padding. In other situations, ciphertext stealing may be appropriate. However, fernet uses PKCS#7 padding, so we won’t bother going off on a tangent to explain the wonders of ciphertext stealing.
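The padding arithmetic can be sketched in a few lines of standard-library Python. PKCS#7 counts in bytes, so the block size here is 16 bytes rather than 128 bits. The function names are illustrative, and this version is not hardened against timing attacks.

```python
BLOCK_SIZE = 16  # bytes, matching the 128-bit AES block

def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    # Always add at least one byte; a full extra block if already aligned.
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len

def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]

padded = pkcs7_pad(b"hello")            # 5 bytes of data + 11 bytes of 0x0b
aligned = pkcs7_pad(b"0123456789abcdef")  # 16 bytes of data + 16 bytes of 0x10
```

Because fernet checks the HMAC before it removes the padding, an attacker cannot feed it modified ciphertexts to learn anything from padding errors.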
What is an HMAC?
Now that we know how fernet uses AES to encrypt data, it’s time to talk about how this data is authenticated. Earlier, we mentioned that AES-CBC doesn’t have in-built authentication. While this is a bit of a hassle, it’s not an insurmountable problem. It just means that we need to add in extra authentication measures in order for AES-CBC to be secure.
Fernet accomplishes its authentication through an HMAC that uses the SHA-256 hash function. HMAC stands for hash-based message authentication code, which is a specific type of message authentication code (MAC). HMACs use secret keys and hash functions in a special way so that a recipient can verify the authenticity and integrity of data that they have received.
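Python's standard library can compute an HMAC directly, which makes the idea easy to sketch. The key, message and function names below are placeholders for illustration.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Return the 32-byte HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(sign(key, message), tag)

secret = b"sixteen byte key"
tag = sign(secret, b"transfer 10 coins")

genuine = verify(secret, b"transfer 10 coins", tag)
forged = verify(secret, b"transfer 99 coins", tag)
```

Here genuine is True and forged is False: without the secret key, an attacker cannot produce a matching tag for a changed message.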
While fernet’s setup of AES-CBC alongside an SHA-256 HMAC can provide the trifecta of confidentiality, integrity and authenticity, it’s kind of an old-fashioned way of doing it.
CBC-mode was initially developed in the seventies, and while it has served us well, there are other options that can also fill the role. One example is AES-GCM, which can both encrypt and authenticate data. This means that if AES-GCM is implemented, there is no need for an additional authentication mechanism like an HMAC.
The cryptography library even includes cryptographic primitives for AES-GCM. However, they are part of the hazardous materials layer, so only those with expertise should implement it.
Despite the fact that AES-GCM includes authentication, there are still some pitfalls that an inexperienced developer could fall into if they were to implement the cryptographic primitives themselves. One example is reusing a nonce with a given key. This can undermine the security of any message that is protected by that nonce and key pair.
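For comparison, this is roughly what AES-GCM looks like with the AESGCM class from the cryptography package's hazardous materials layer. It is a minimal sketch with an illustrative message. The important detail is the nonce, which the caller must generate and must never repeat under the same key.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
aesgcm = AESGCM(key)

# A fresh 96-bit nonce for every message encrypted under this key.
nonce = os.urandom(12)

# The third argument is optional associated data; None means there is none.
ciphertext = aesgcm.encrypt(nonce, b"secret message", None)
recovered = aesgcm.decrypt(nonce, ciphertext, None)
```

The output carries a 16-byte authentication tag after the encrypted bytes, so no separate HMAC is needed. The nonce has to be stored or sent alongside the ciphertext so the recipient can decrypt.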
What is a fernet token?
When a user wants to encrypt a message with fernet, the recipe takes the plaintext message, a key and a timestamp as inputs, and then uses this information to produce a token. The token includes an encrypted version of the plaintext message, as well as the information needed to verify the integrity of the message. This means that the token provides confidentiality, integrity and authentication to the plaintext message.
Fernet tokens are not JSON or any other structured file format. Each token is a single binary string made by concatenating a handful of fields, and that string is then encoded according to the base64url specification, a variant of base64 whose output is safe to use in URLs and filenames.
A fernet token’s base64url encoding is a concatenation of the following fields:
- Version — The version field is an 8-bit number that specifies which version of fernet will be used. There is only currently one defined version, and its value is 128 in decimal, or 0x80 in hexadecimal.
- Timestamp — The timestamp is a 64-bit integer. It is both unsigned and big-endian. It records the number of seconds that passed between January 1, 1970 UTC and the moment the fernet token was created.
- Initialization Vector — In order for fernet to be secure, a unique initialization vector must be used for the AES encryption. These must be 128-bit numbers. Fernet generates random numbers with os.urandom() in Python. This uses entropy sources from the operating system to ensure that the unique number is sufficiently random. This prevents an attacker from being able to predict the random number output.
- Ciphertext — The ciphertext field will vary in length, according to the size of the initial plaintext message that has been encrypted. It will always be a multiple of the 128-bit AES block size, with padding used to fill up the last block.
- HMAC — The HMAC covers each of the above four fields. This means that the version, timestamp, initialization vector and ciphertext have been authenticated by the HMAC, allowing a recipient to verify the integrity and authenticity of this data. The above four fields are concatenated and then run through an SHA-256 HMAC. Note that the inputs to the HMAC are not base64url encoded.
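The field layout above can be unpacked with the standard library alone. The sketch below assumes the field sizes listed (1-byte version, 8-byte timestamp, 16-byte initialization vector, 32-byte HMAC) and uses a hypothetical function name. It only splits the token and does not verify anything.

```python
import base64
import struct

def split_token(token: bytes) -> dict:
    """Split a base64url-encoded fernet token into its raw fields."""
    raw = base64.urlsafe_b64decode(token)
    # Smallest valid token: version + timestamp + IV + one block + HMAC.
    if len(raw) < 1 + 8 + 16 + 16 + 32:
        raise ValueError("token is too short")
    return {
        "version": raw[0],                               # expected 0x80
        "timestamp": struct.unpack(">Q", raw[1:9])[0],   # big-endian, unsigned
        "iv": raw[9:25],
        "ciphertext": raw[25:-32],
        "hmac": raw[-32:],
    }
```

Passing a real token to split_token shows, for example, when it was created without needing the key, which is a reminder that the timestamp is authenticated but not secret.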
The fernet encryption and authentication process
The fernet encryption and authentication process unfolds along the following steps:
- The timestamp is recorded.
- os.urandom() is used to generate a unique and sufficiently random initialization vector.
- The ciphertext is constructed:
a. The plaintext is padded out according to PKCS #7 so that each block is 128-bits.
b. The padded message is encrypted with 128-bit AES in CBC mode, using an encryption key supplied by the user, as well as the initialization vector generated by os.urandom().
- An HMAC is computed for the version, timestamp, initialization vector and ciphertext fields.
- All of the above fields, plus the HMAC are concatenated together.
- The token is encoded according to the base64url specification.
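The steps above can be sketched by assembling a token by hand and checking that the real Fernet class accepts it. The build_token function is hypothetical and exists only to show the process. It assumes the key layout from the fernet specification (signing key first, encryption key second). In practice you would simply call Fernet.encrypt().

```python
import base64
import hashlib
import hmac
import os
import struct
import time

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def build_token(key_b64: bytes, plaintext: bytes) -> bytes:
    key = base64.urlsafe_b64decode(key_b64)
    signing_key, encryption_key = key[:16], key[16:]

    # Record the timestamp as a 64-bit big-endian unsigned integer.
    timestamp = struct.pack(">Q", int(time.time()))
    # Generate a random 128-bit initialization vector.
    iv = os.urandom(16)

    # Pad with PKCS#7 (block size given in bits), then encrypt with AES-CBC.
    padder = padding.PKCS7(128).padder()
    padded = padder.update(plaintext) + padder.finalize()
    encryptor = Cipher(algorithms.AES(encryption_key), modes.CBC(iv)).encryptor()
    ciphertext = encryptor.update(padded) + encryptor.finalize()

    # HMAC the version, timestamp, IV and ciphertext, then append the tag.
    body = b"\x80" + timestamp + iv + ciphertext
    tag = hmac.new(signing_key, body, hashlib.sha256).digest()

    # Encode the whole token as base64url.
    return base64.urlsafe_b64encode(body + tag)

key = Fernet.generate_key()
token = build_token(key, b"hello fernet")
recovered = Fernet(key).decrypt(token)   # the library accepts our token
```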
Verifying the fernet token
A user can verify the token and decrypt it if they also have the secret key. This allows them to access the message, while ensuring that it maintains its authenticity and integrity. The process includes the following steps:
- Reverse the base64url encoding of the token.
- Check that the token’s first byte is 0x80 (this is 128 in decimal. It tells you the version of fernet that is being used).
- If the token has a maximum age, verify that the token isn’t too old.
- Compute the HMAC from the version, timestamp, initialization vector and ciphertext fields. This requires the user-supplied key.
- Check that this computed HMAC matches the HMAC that is included in the token.
- Use the encryption key and the initialization vector to decrypt the AES-CBC ciphertext.
- Remove the padding of the decrypted message. This gives you the original plaintext.
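The checks that come before decryption can be sketched with the standard library. The function below is a hypothetical illustration, not the library's code: it decodes the token, checks the version byte, optionally checks the age, and compares HMACs, returning the initialization vector and ciphertext that would then go on to AES-CBC decryption.

```python
import base64
import hashlib
import hmac
import struct
import time

def check_token(key_b64: bytes, token: bytes, ttl=None, now=None):
    """Run the pre-decryption checks and return (iv, ciphertext)."""
    signing_key = base64.urlsafe_b64decode(key_b64)[:16]

    # Reverse the base64url encoding and check the version byte.
    raw = base64.urlsafe_b64decode(token)
    if len(raw) < 73 or raw[0] != 0x80:
        raise ValueError("invalid token")

    # If a maximum age (in seconds) was given, reject tokens that are too old.
    if ttl is not None:
        created = struct.unpack(">Q", raw[1:9])[0]
        current = int(time.time()) if now is None else now
        if created + ttl < current:
            raise ValueError("invalid token")

    # Recompute the HMAC over everything except the stored HMAC, then compare.
    body, stored_tag = raw[:-32], raw[-32:]
    expected_tag = hmac.new(signing_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected_tag, stored_tag):
        raise ValueError("invalid token")

    return raw[9:25], raw[25:-32]
```

Only after all of these checks pass is it safe to decrypt the ciphertext and strip the padding.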
If any part of the verification process fails, the cryptography library raises an InvalidToken exception. The exception does not say which check failed, so the same error covers a malformed token, an expired token and a bad HMAC.
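The following sketch shows how a failed verification surfaces when using the cryptography package. It tampers with one token and backdates another with encrypt_at_time(), then tries to decrypt both. The try_decrypt helper and the payload are made up for this example.

```python
import base64
import time

from cryptography.fernet import Fernet, InvalidToken

f = Fernet(Fernet.generate_key())
token = f.encrypt(b"payload")

# Flip one bit inside the ciphertext to simulate tampering.
raw = bytearray(base64.urlsafe_b64decode(token))
raw[30] ^= 0x01
tampered = base64.urlsafe_b64encode(bytes(raw))

# Create a token that claims to be two minutes old.
old_token = f.encrypt_at_time(b"payload", int(time.time()) - 120)

def try_decrypt(candidate: bytes, ttl=None):
    """Return the plaintext, or None if the token is rejected."""
    try:
        return f.decrypt(candidate, ttl=ttl)
    except InvalidToken:
        return None
```

With this setup, try_decrypt(token) returns the payload, while try_decrypt(tampered) and try_decrypt(old_token, ttl=60) both return None.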
Using passwords with fernet
Users can opt to use passwords in fernet for data protection. However, first, the password needs to be run through a key derivation function like bcrypt, Scrypt, or PBKDF2HMAC to turn it into a 256-bit key. The salt used in the derivation must be stored somewhere your application can retrieve it, because without the same salt the same key cannot be derived from the password later.
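As a rough sketch of the password route, the snippet below derives a fernet-compatible key with hashlib.pbkdf2_hmac from the standard library, which computes the same PBKDF2 function with HMAC-SHA256. The function name, the example password and the iteration count are assumptions for illustration. Choose the iteration count according to current guidance for your environment.

```python
import base64
import hashlib
import os

def derive_fernet_key(password: bytes, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a password into a base64url-encoded 32-byte key."""
    raw = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
    return base64.urlsafe_b64encode(raw)

# The salt is not secret, but it must be saved next to the encrypted data.
salt = os.urandom(16)
key = derive_fernet_key(b"correct horse battery staple", salt)
```

The resulting key has the same format as the output of Fernet.generate_key(), so it can be passed to Fernet(key). Deriving again with the same password and salt gives the same key, while a different salt gives a different one.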
The limitations of fernet
Fernet is designed so that it doesn’t expose unauthenticated bytes. Because of this, the entire message contents must be able to fit in the available memory. This makes fernet unsuitable for encrypting very large files.
Overall, fernet is a useful tool for helping Python developers to secure smaller amounts of data. It takes a lot of the dangerous guesswork out of implementing cryptography, making fernet ideal for those who lack expertise in the field.
While fernet serves a useful purpose, there are also alternatives that can play similar roles. One of these is AES-GCM, which has inbuilt encryption and authentication measures. This contrasts with fernet’s AES-CBC mode of operation, which requires an additional HMAC for authentication.
The article What is fernet and when should you use it? first appeared on Comparitech.