Challenge Description
Someone once told me “Why settle for boring old robots.txt when you can have clankers.txt guarding your website?”
- Author: Jun Wei
- Category: web
- Difficulty: beginner
Disallow Robots - Solution
What is robots.txt?
robots.txt is a text file placed at the root of a website (e.g., https://example.com/robots.txt) that tells web crawlers and search engine bots which paths they may and may not crawl.
Example format:
User-agent: *
Disallow: /admin/
Disallow: /secret-page
While robots.txt is intended for polite bots, it is not a security mechanism. Any human can simply visit robots.txt directly in a browser. In CTF challenges (and real-world security assessments), robots.txt often inadvertently reveals sensitive or hidden paths that the site owner didn’t want indexed.
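To make the "polite bots" point concrete, here is a minimal sketch that uses Python's standard-library urllib.robotparser to check the example rules the way a compliant crawler would. The helper name is_allowed and the example.com URLs are illustrative assumptions, not part of the challenge.

```python
from urllib.robotparser import RobotFileParser

# The example rules from the section above.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /admin/
Disallow: /secret-page
"""


def is_allowed(rules: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a compliant crawler may fetch `url` under `rules`.

    `is_allowed` is a hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/", "/admin/users", "/secret-page"):
        url = "https://example.com" + path
        verdict = "allowed" if is_allowed(EXAMPLE_RULES, url) else "disallowed"
        print(path, verdict)
```

Nothing here stops a client from requesting a disallowed path. The check only runs if the crawler chooses to run it, which is why the file offers no protection.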
- Connect to the VPN and visit the website.
- Press Ctrl + U to view the page source. The source code comments hint at checking robots.txt. Alternatively, navigate directly to /robots.txt on the target website; the file is publicly accessible by design, since crawlers need to read it.
Here are the contents of robots.txt:
User-agent: *
# Congratulations, you found the first clue! The path you're not supposed to visit is the one you must visit.
Disallow: /f1ag_h3r3_n0w_th3_r0b0ts_path
- Navigate to the disallowed path (/f1ag_h3r3_n0w_th3_r0b0ts_path) in your browser. The flag will be shown in plaintext.
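The manual steps above could also be scripted. The following is a rough sketch, not the author's method: it pulls every Disallow path out of a robots.txt body and turns each one into a URL worth visiting. The base URL http://target.example/ and the helper names are assumptions; the real target is only reachable over the challenge VPN.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

# Contents of robots.txt as shown in this writeup.
CHALLENGE_ROBOTS = """\
User-agent: *
# Congratulations, you found the first clue! The path you're not supposed to visit is the one you must visit.
Disallow: /f1ag_h3r3_n0w_th3_r0b0ts_path
"""


def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect the value of every non-empty Disallow rule."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def candidate_urls(base_url: str, robots_txt: str) -> list[str]:
    """Resolve each disallowed path against the site's base URL."""
    return [urljoin(base_url, path) for path in disallowed_paths(robots_txt)]


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download a page as text (only call this against the real target)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    base = "http://target.example/"  # hypothetical address, replace with the real one
    for url in candidate_urls(base, CHALLENGE_ROBOTS):
        print(url)
```

Against a live target, passing fetch(urljoin(base, "/robots.txt")) to candidate_urls and then calling fetch on each result would reproduce the browser steps.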
Quick shortcut
In any web CTF, checking /robots.txt and viewing the page source (Ctrl + U) should be among your first steps. They frequently reveal hidden paths, comments, or credentials.
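Reading the page source for hints can be automated in the same spirit. This is a small sketch, using the standard-library html.parser, that lists every HTML comment in a page. SAMPLE_PAGE is invented for illustration and is not the challenge's actual source.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every <!-- ... --> comment it encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.comments: list[str] = []

    def handle_comment(self, data: str) -> None:
        self.comments.append(data.strip())


def extract_comments(html: str) -> list[str]:
    """Return all HTML comments found in `html`, in document order."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments


# Hypothetical page source; the real challenge page may differ.
SAMPLE_PAGE = """<html><body>
<h1>Welcome</h1>
<!-- TODO: remember to check robots.txt -->
</body></html>"""

if __name__ == "__main__":
    for comment in extract_comments(SAMPLE_PAGE):
        print(comment)
```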
Flag: HNF25{s0uRc3_c0d3_anD_r0b0ts_m4tt3r_s0m3t1m3s!}