I Caught 1300+ Cyberattacks Using a Python Honeypot 🍯: Real Data, Vulnerability Analysis, and Tips

Introduction

In recent years, cybersecurity has become one of the top concerns for developers and companies. To explore this field hands-on, I launched a small experiment: building a honeypot, a trap designed to attract malicious bots, log their behavior, and analyze the types of attacks they perform.

The full project is available on GitHub: https://github.com/fabiobiffi/honeypot


What Is a Honeypot?

A honeypot is a system intentionally exposed to the Internet with the goal of being attacked. It doesn’t contain sensitive data but acts as bait to study malicious behaviors. It can be a full server, a web app, or a basic API endpoint that logs every incoming interaction.

In my case, it’s a simple Python-based HTTP server that logs all incoming HTTP requests, including headers, methods, and query parameters.


Technical Setup

The project is written in Python and uses a lightweight HTTP server that logs each request to a file. Each log entry includes:

  • Timestamp
  • Source IP
  • HTTP Method
  • Requested Path
  • User-Agent
  • Query string
  • Optional POST payload

To expose the server online, I used a Linux virtual machine with Docker and Nginx acting as a reverse proxy.
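
One consequence of this setup is that the application sees the proxy as the peer of every connection, so the real source IP has to be taken from a forwarding header. The sketch below is a minimal illustration, assuming Nginx is configured to append the client address to X-Forwarded-For (for example with proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for). The function name client_ip and the trusted_hops parameter are hypothetical and not part of the project.

```python
def client_ip(peer_addr, forwarded_for, trusted_hops=1):
    """Return the most plausible client IP for a request received via proxies.

    peer_addr     -- address of the direct TCP peer (the proxy itself)
    forwarded_for -- raw X-Forwarded-For header value, or None
    trusted_hops  -- number of proxies we control (assumption: 1, i.e. Nginx)
    """
    if not forwarded_for or trusted_hops < 1:
        return peer_addr
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    if len(hops) < trusted_hops:
        # Header is shorter than expected: fall back to the peer address.
        return peer_addr
    # Entries to the left of this one are client-supplied and can be spoofed.
    return hops[-trusted_hops]
```

In a Flask application, Werkzeug's ProxyFix middleware (werkzeug.middleware.proxy_fix) applies the same idea; whether the original deployment uses it is not stated here.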


Honeypot Code Example

Here’s a simplified version of the Python code used to intercept and log requests:

from flask import Flask, request
import logging
from logging.handlers import RotatingFileHandler
import os

app = Flask(__name__)


# Configure logging
def setup_logger():
    logger = logging.getLogger("honeypot")
    logger.setLevel(logging.INFO)

    # Create handlers
    file_handler = RotatingFileHandler(
        "honeypot_logs.log", maxBytes=10000000, backupCount=5
    )
    console_handler = logging.StreamHandler()

    # Create a formatter and attach it to both handlers
    log_format = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    file_handler.setFormatter(log_format)
    console_handler.setFormatter(log_format)

    # Add handlers to the logger
    logger.addHandler(file_handler)
    logger.addHandler(console_handler)

    return logger


logger = setup_logger()


@app.before_request
def log_request():

    if request.path == "/favicon.ico":
        # Ignore favicon requests
        return "", 204

    log_message = {
        "ip": request.remote_addr,
        "path": request.path,
        "method": request.method,
        "user_agent": request.headers.get("User-Agent"),
        "query": request.query_string.decode(),
    }

    if request.method == "POST":
        log_message["post_data"] = request.get_data(as_text=True)

    logger.info(f"Request detected: {log_message}")


# Accept GET and POST on every path, including the root
@app.route("/", defaults={"path": ""}, methods=["GET", "POST"])
@app.route("/<path:path>", methods=["GET", "POST"])
def catch_all(path):
    return "OK", 200


if __name__ == "__main__":
    port = int(os.environ.get("HONEYPOT_PORT", 8080))
    app.run(host="0.0.0.0", port=port)

The script is designed to be simple and easy to understand. It was enough to attract and analyze a surprising number of automated bots.
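
Because the script logs the dictionary through an f-string, each log line ends with a Python dict literal, which can be read back for analysis. The following is a minimal sketch, assuming the exact formatter shown above ("%(asctime)s - %(levelname)s - %(message)s"); the function name parse_log_line and the sample line are hypothetical.

```python
import ast

PREFIX = "Request detected: "


def parse_log_line(line):
    """Parse one honeypot log line into (timestamp, entry_dict), or None."""
    # The formatter joins timestamp, level, and message with " - ".
    parts = line.rstrip("\n").split(" - ", 2)
    if len(parts) != 3 or not parts[2].startswith(PREFIX):
        return None
    timestamp, _level, message = parts
    try:
        # literal_eval only accepts Python literals, so logged payloads
        # are never executed.
        entry = ast.literal_eval(message[len(PREFIX):])
    except (ValueError, SyntaxError):
        return None
    return (timestamp, entry) if isinstance(entry, dict) else None


# Hypothetical sample line in the format produced by the logger above.
sample = (
    "2025-01-01 00:00:00,000 - INFO - Request detected: "
    "{'ip': '203.0.113.7', 'path': '/.env', 'method': 'GET', "
    "'user_agent': None, 'query': ''}"
)
print(parse_log_line(sample))
```

Logging each entry as JSON instead would make parsing simpler, at the cost of changing the log format.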


Goals of the Experiment

My primary objectives were:

  1. Capture and analyze suspicious requests.
  2. Identify common vulnerabilities exploited by bots.
  3. Study patterns in bot traffic.
  4. Learn how to better secure real-world applications.

Collected Data Analysis

My honeypot received over 1300 requests, more than 100 of which were potentially malicious. Below are some of the main attack types I identified.

1. Access Attempts to .env Files

Many requests attempted to access .env files, commonly found in Laravel and Node.js projects:

GET /.env

These attacks aim to extract environment variables that may contain passwords, API keys, and database credentials.

Common paths targeted:

  • /admin/.env
  • /api/.env
  • /config/.env

Recommended protection: Never expose .env files in production. Configure your web server to block access.
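
Blocking is best done in the web server, but the same rule can be sketched at the application level. The example below is only an illustration: the function name is_sensitive_path and the choice to deny every dot-prefixed path segment except .well-known are assumptions, and the path is assumed to be already percent-decoded.

```python
import re

# Hypothetical deny rule: any path segment starting with a dot
# (".env", ".git", ...), except ".well-known", which has legitimate
# uses such as ACME challenges.
_DOTFILE_SEGMENT = re.compile(r"(^|/)\.(?!well-known(/|$))[^/]+")


def is_sensitive_path(path):
    """Return True if the URL path contains a hidden file or directory."""
    return bool(_DOTFILE_SEGMENT.search(path))


for candidate in ["/.env", "/admin/.env", "/.git/config", "/index.html"]:
    print(candidate, is_sensitive_path(candidate))
```

In a Flask app, a check like this could run in a before_request hook and answer with a 404 for matching paths.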

2. CVE-2024-4577: PHP Remote Code Execution

One of the most sophisticated attacks looked like this:

POST /hello.world
Query: %ADd+allow_url_include%3d1+%ADd+auto_prepend_file%3dphp://input
Payload: <?php shell_exec(base64_decode(...)); echo(md5("Hello CVE-2024-4577")); ?>

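This is an exploit for CVE-2024-4577, an argument-injection vulnerability in PHP running in CGI mode on Windows that leads to Remote Code Execution. Windows' "Best-Fit" character conversion turns the soft hyphen (%AD) into a regular hyphen, so the query string reaches the PHP binary as -d options that enable allow_url_include and make PHP execute the request body via auto_prepend_file=php://input.

Recommended protection: Update PHP to a patched release (8.1.29, 8.2.20, 8.3.8, or later) and avoid exposing PHP in CGI mode. Note that allow_url_include is a configuration directive rather than a function, and keeping it off in php.ini does not stop this exploit, because the attacker enables it through the injected option.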
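
Requests of this kind are easy to flag in logs, because a percent-encoded soft hyphen followed by "d" and a PHP directive name rarely appears in legitimate traffic. The heuristic below is a rough sketch, not a complete detection rule; the function name looks_like_cve_2024_4577 and the list of directives are assumptions based on the request shown above.

```python
from urllib.parse import unquote_to_bytes

SOFT_HYPHEN = b"\xad"  # the byte that "%AD" decodes to
# Directives seen in the captured request; other payloads may use more.
INJECTED_DIRECTIVES = (b"allow_url_include", b"auto_prepend_file")


def looks_like_cve_2024_4577(raw_query):
    """Heuristic check on a raw (still percent-encoded) query string."""
    decoded = unquote_to_bytes(raw_query)
    has_soft_hyphen_flag = (SOFT_HYPHEN + b"d") in decoded
    has_directive = any(d in decoded for d in INJECTED_DIRECTIVES)
    return has_soft_hyphen_flag and has_directive


captured = "%ADd+allow_url_include%3d1+%ADd+auto_prepend_file%3dphp://input"
print(looks_like_cve_2024_4577(captured))
```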

3. PHPUnit eval-stdin.php (CVE-2017-9841)

Dozens of requests targeted:

/vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php

If left on a production server, this file executes arbitrary PHP code sent in the request body. The issue is tracked as CVE-2017-9841.

Recommended protection: Remove test tools and files before deploying to production.

4. ThinkPHP Remote Execution

Attacks against the ThinkPHP framework included:

/index.php?s=/index/\think\app/invokefunction&function=call_user_func_array&vars[0]=md5&vars[1][]=Hello

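This is typical of RCE exploits against older ThinkPHP versions, such as CVE-2018-20062 and CVE-2019-9082. Here the bot only asks the server to compute the md5 of a fixed string, which looks like a harmless probe to confirm code execution before sending a real payload.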

5. Luci CGI-bin Exploit (Router Injection)

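Several requests targeted paths like:

/cgi-bin/luci/;stok=/locale

LuCI is the web interface of OpenWrt and of vendor firmware derived from it. This particular path matches the locale endpoint abused by CVE-2023-1389, a command injection in TP-Link Archer AX21 routers.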

6. Exposed Git Files

Requests to:

/.git/config

can lead to full source code exposure if the .git directory is publicly accessible.
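
The attack types above can be told apart with a handful of path and query signatures. The sketch below is one possible way to label logged requests; the labels, the classify function, and the exact matching rules are illustrative assumptions derived from the examples in this section, not the method used to produce the figures in this article.

```python
# Each signature is (label, predicate); the first match wins.
SIGNATURES = [
    ("env_file", lambda path, query: path.endswith("/.env")),
    ("git_exposure", lambda path, query: path.startswith("/.git/")),
    (
        "phpunit_eval_stdin",
        lambda path, query: path.endswith("/phpunit/src/Util/PHP/eval-stdin.php"),
    ),
    (
        "thinkphp_rce",
        lambda path, query: "invokefunction" in query and "think" in query,
    ),
    ("luci_probe", lambda path, query: path.startswith("/cgi-bin/luci")),
    # "%AD" is the encoded soft hyphen used by the PHP-CGI exploit.
    ("php_cgi_arg_injection", lambda path, query: "%ad" in query.lower()),
]


def classify(path, query=""):
    """Return a label for a request, or "other" if no signature matches."""
    for label, matches in SIGNATURES:
        if matches(path, query):
            return label
    return "other"


print(classify("/admin/.env"))
print(classify("/cgi-bin/luci/;stok=/locale"))
```

Keeping the signatures in a plain list makes it easy to add a new rule when an unfamiliar pattern shows up in the logs.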


Bot Profile Analysis

  • Modified Mozilla/5.0 headers: used to bypass basic filters.
  • python-requests, curl, aiohttp: common HTTP clients and libraries used for scripted requests.
  • CensysInspect, Xpanse, zgrab: reconnaissance tools.
  • Custom-AsyncHttpClient: often seen in RCE and probing requests.
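
A grouping like the one above can be reproduced by matching substrings in the logged User-Agent values. The following sketch assumes the family names and substring lists shown in the code, which are illustrative; scanners are checked first because tools such as CensysInspect and zgrab also include "Mozilla/5.0" in their User-Agent.

```python
from collections import Counter

# (family, substrings), checked in order; matching is case-insensitive.
UA_FAMILIES = [
    ("scanner", ("censysinspect", "xpanse", "zgrab")),
    (
        "http_library",
        ("python-requests", "curl", "aiohttp", "custom-asynchttpclient"),
    ),
    ("browser_like", ("mozilla/5.0",)),
]


def ua_family(user_agent):
    """Map a raw User-Agent string (or None) to a coarse family label."""
    ua = (user_agent or "").lower()
    if not ua:
        return "missing"
    for family, needles in UA_FAMILIES:
        if any(needle in ua for needle in needles):
            return family
    return "other"


def count_families(user_agents):
    """Count how many requests fall into each User-Agent family."""
    return Counter(ua_family(ua) for ua in user_agents)
```

Since the User-Agent header is fully controlled by the client, these labels describe what a bot claims to be, not what it is.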

Statistics Snapshot

  • Total requests: 578
  • GET requests: 98%
  • POST payloads with malicious content: 6
  • Requests for .env files: 74
  • PHPUnit exploit attempts: 38
  • Unique User-Agents: 112
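
Counts of this kind can be derived directly from the logged request dictionaries. The sketch below assumes each entry has the same keys as log_message in the honeypot code; the summarize function is hypothetical, and counting POST requests with a non-empty body is only a rough proxy for "malicious payloads", which still need manual review.

```python
from collections import Counter


def summarize(entries):
    """Compute snapshot-style counts from a list of logged request dicts."""
    total = len(entries)
    methods = Counter(e.get("method") for e in entries)
    return {
        "total_requests": total,
        "get_share_percent": round(100 * methods["GET"] / total) if total else 0,
        # Proxy metric: POST requests that carried any body at all.
        "post_with_payload": sum(
            1 for e in entries
            if e.get("method") == "POST" and e.get("post_data")
        ),
        "env_requests": sum(
            1 for e in entries if (e.get("path") or "").endswith(".env")
        ),
        "unique_user_agents": len({e.get("user_agent") for e in entries}),
    }
```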

Key Takeaways

  1. Malicious bots scan new IPs almost instantly.
  2. Most attacks focus on publicly known vulnerabilities.
  3. Servers with default settings can be compromised within minutes.
  4. User-Agent strings and URL paths reveal attacker intent.

How to Secure a Real Application

  • Block access to sensitive files via web server configuration
  • Keep frameworks and dependencies up to date
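  • Disable risky PHP functions (shell_exec, exec, system, etc.) with the disable_functions directive; eval is a language construct and cannot be disabled this way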
  • Use a Web Application Firewall (WAF)
  • Monitor access logs continuously

Conclusion

This experiment proved that even a basic honeypot can reveal a lot about the current threat landscape. Honeypots are valuable for:

  • increasing threat awareness,
  • testing and improving your own defenses,
  • contributing to the wider cybersecurity community.

If you found this article useful, feel free to follow me on my blog or GitHub. The honeypot code is public and can be reused, adapted, or expanded.

GitHub: https://github.com/fabiobiffi/honeypot
