CVE-2023-36053: Potential ReDoS in EmailValidator/URLValidator

Enoshima railway, 2023 🇯🇵

Hi! I'm Seokchan Yoon, and I study offensive security in Korea. 🇰🇷

__

1. Introduction

On July 3rd, 2023, my first CVE was published!

It's me, haha.

You can find brief details of the vulnerability I discovered in the links above. In this article, I would like to share the thought process that led me to it (although I'm not a great player, and what I found is not a high-impact vulnerability 😅). I'll also share my thoughts on how to read code as an offensive security researcher.

Rather than focusing on line-by-line code analysis, I describe the root cause of the vulnerability in words. Let's start.

__

2. Analyzing the core functions

If you're an offensive security researcher, you'll probably agree: to analyze any software or open-source code, the core functions need to be analyzed first. The parts of Django that I considered core are as follows.

  • The entry point that loads Django's core functions
    (e.g. which code runs when we type the runserver subcommand)
  • The function that maps parsed HTTP requests/responses to Python objects
  • The function that maintains the session across HTTP streams
  • The function that handles file transmission
  • The function that generates SQL queries through ORM-related methods
    (e.g. Model.objects.get(id=1))

So I started by analyzing the items above to understand Django's overall structure.

__

3. Analyzing the bug cases

After analyzing the main features, the overall structure of the program began to feel familiar. So I started studying publicly disclosed 1-day vulnerabilities. Once you understand Django's structure, you can read the code efficiently.

The links below are the vulnerabilities I analyzed during this process, so refer to them if you are interested.

https://ufo.stealien.com/2022-12-16/analyzing-django-orm-with-1-day (KOR)
https://new-blog.ch4n3.kr/cve-2022-28347/ (KOR)
https://new-blog.ch4n3.kr/cve-2022-34265-analysis/ (KOR)

Last year, there were three SQLi vulnerabilities rated as high severity. So at that time I decided to hunt for security vulnerabilities with a particular focus on SQL injection. But I couldn't find any more, and I became exhausted by the analysis. 😓 After that, I didn't spend much time looking for vulnerabilities in Django.

After quite a long time, I came up with an idea while analyzing the vulnerability disclosed in the release below.

https://www.djangoproject.com/weblog/2023/feb/01/security-releases/

This bug case points out that complex regular expressions can be vulnerable to DoS attacks when they have to handle very large strings. Although each language's regex implementation differs in its details, a regex search behaves at least like a linear search, because it has to scan the string from start to end. Therefore, if a very large text is passed as input to a regular expression, significant computational resources are consumed. In short, this ReDoS attack works by feeding a large string to a complex regex.

Please refer to my post for a detailed explanation of the above vulnerability. Please turn off your adblock (joke lol 😅)
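To make the cost of large inputs concrete, here is a minimal timing sketch. The pattern `\s+$` and the input sizes are hypothetical choices for illustration only and are not taken from Django; on this kind of non-matching input, a backtracking engine such as Python's `re` can spend time that grows faster than linearly with the input length.

```python
import re
import time

# Hypothetical pattern chosen only for illustration; it is not from Django.
# On a run of spaces followed by a non-space, the search fails from every
# start position after backtracking through the remaining spaces.
TRAILING_SPACE = re.compile(r"\s+$")


def time_search(length: int) -> float:
    """Return the seconds spent searching a non-matching input of this length."""
    text = " " * length + "x"  # the trailing "x" makes every attempt fail
    start = time.perf_counter()
    TRAILING_SPACE.search(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    for length in (1_000, 2_000, 4_000):
        print(f"{length:>6} chars: {time_search(length) * 1000:.2f} ms")
```

The absolute numbers depend on the machine and Python version, so only the trend across sizes is meaningful.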

https://new-blog.ch4n3.kr/cve-2023-23969/ (EN)

__

4. Planning scenario

Last year, when I read the source code looking for SQL injection vulnerabilities, I got tired quickly because I had no scenario in mind while reading. So this time, I analyzed the source code only when I had a scenario. Analyzing 1-day vulnerabilities tends to suggest similar scenarios, so I wrote each one down in a notes app and analyzed it later when I had spare time. I think this approach greatly improved the efficiency of my analysis (methodology differs from hacker to hacker, so please treat this only as a reference 😅).

I analyzed how CVE-2023-23969 was patched, along with other 1-day vulnerabilities. From this, I could guess the conditions that must be satisfied for a report to be accepted as a security vulnerability. They are as follows.

  • Repetition operators such as +, *, and {0,n} are used in overlapping ways, so the regex engine's finite automaton repeats many times
  • No length check is performed before the regular expression containing those operators is applied
  • A user-supplied value is highly likely to be passed as an argument to a function that satisfies the above conditions

__

5. Analyzing the code

Two vulnerabilities were discovered during the verification process, as per the previously planned scenario. I will use EmailValidator as an example to explain how these vulnerabilities are caused.

EmailValidator is a validator class present in django.core.validators. It checks whether the user-supplied string adheres to the email format.

# django/core/validators.py

@deconstructible
class EmailValidator:
    # ...
    domain_regex = _lazy_re_compile(
        # max length for domain name labels is 63 characters per RFC 1034
        r"((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z",
        re.IGNORECASE,
    )
    # ...

In this validator, an internal variable named domain_regex stores the regular expression used to validate the domain part of the email. Since this regular expression combines overlapping {0,61} and + repetitions, it may end up processing a very large string.

However, if length validation prevents excessively large strings from being passed to the regular expression check, it becomes difficult to trigger a ReDoS attack. In a test on my laptop (Apple MacBook Pro 13" / M2, 16GB), EmailValidator showed a significant delay of over 100 ms in the regular expression operation only when a string of 1 MB or larger was passed as the argument.

Therefore, if this validator had had appropriate length validation logic, it would have been safeguarded against ReDoS attacks. However, both EmailValidator and URLValidator, the targets of these vulnerabilities, had no such check before the regular expression search, which made them vulnerable to ReDoS attacks.

Since both email addresses and URLs have formats standardized by RFCs, ReDoS attacks could have been sufficiently mitigated if length verification logic (according to the RFC standards) had been added.
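A measurement like the one described above can be sketched as follows. The pattern is copied from the excerpt, but it is compiled with plain `re.compile` so the snippet runs without Django. The helper `time_domain_match`, the crafted input, and the sizes are my own assumptions, and I assume the validator applies the pattern with `match` to the domain part.

```python
import re
import time

# Same pattern as EmailValidator.domain_regex in the excerpt, compiled with
# re.compile instead of Django's _lazy_re_compile to avoid the dependency.
DOMAIN_REGEX = re.compile(
    r"((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z",
    re.IGNORECASE,
)


def time_domain_match(label_count: int) -> float:
    """Time one match against label_count short labels plus an invalid tail."""
    # Hypothetical input: many one-letter labels, then "!" so the match fails.
    domain = "a." * label_count + "!"
    start = time.perf_counter()
    DOMAIN_REGEX.match(domain)
    return time.perf_counter() - start


if __name__ == "__main__":
    for label_count in (1_000, 10_000, 100_000):
        elapsed_ms = time_domain_match(label_count) * 1000
        print(f"{label_count * 2 + 1:>7} bytes: {elapsed_ms:.2f} ms")
```

Timings vary by hardware and Python version, so the sizes may need to be raised to reproduce a delay comparable to the one reported here.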
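As a minimal sketch of that idea, the function below rejects oversized input before any regular expression runs. It is not Django's actual code: the function name `domain_looks_valid` is hypothetical, and the 320-character cap is an assumption based on RFC 3696 section 3 (64 characters for the local part, one for "@", and 255 for the domain). Only the domain part is checked, because the user-part pattern is not shown in this article.

```python
import re

# Pattern copied from the EmailValidator excerpt; compiled with re.compile
# so that the sketch does not depend on Django.
DOMAIN_REGEX = re.compile(
    r"((?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+)(?:[A-Z0-9-]{2,63}(?<!-))\Z",
    re.IGNORECASE,
)

# Assumption: 64 (local part) + 1 ("@") + 255 (domain), per RFC 3696 section 3.
MAX_EMAIL_LENGTH = 320


def domain_looks_valid(value: str, max_length: int = MAX_EMAIL_LENGTH) -> bool:
    """Check the domain part of an address, rejecting oversized input first."""
    # Cheap checks run before the regex, so huge strings never reach it.
    if not value or "@" not in value or len(value) > max_length:
        return False
    _user_part, domain_part = value.rsplit("@", 1)
    # Validation of the user part is intentionally omitted in this sketch.
    return DOMAIN_REGEX.match(domain_part) is not None


if __name__ == "__main__":
    print(domain_looks_valid("user@example.com"))
    print(domain_looks_valid("u@" + "a." * 500_000 + "com"))
```

Because the length comparison costs almost nothing, the second call returns immediately instead of running the pattern over roughly 1 MB of input.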

__

6. Sending Report

Most large open-source projects have a security policy. For Django, I found the relevant information at the link below.

https://docs.djangoproject.com/en/dev/internals/security/

After finding the two vulnerabilities, I sent a security report to the Django Security Team by email, and after about 2 weeks a CVE number was assigned. Throughout this process, I felt that the Django Security Team is very professional, from receiving the report through patching and feedback.

__

7. Conclusion

In this article, I have shared my experience and thought process in discovering security vulnerabilities in Django, as well as the process of sending a security report and getting a CVE number assigned. Finally, I would like to express my gratitude to yelang123, NGA, and the members of the STEALIEN R&D team, who gave me great courage and motivation. 🙇‍♂️