Create a Robots.txt Honeypot Cyber Counterintelligence Tool

What is the Robots.txt File?

Robots.txt is a simple text file, meant to be read by search engines and other web crawlers, that contains structured rules for crawling your website.

In theory, search engines are supposed to honor the Robots.txt rules and not crawl any URL the file tells them to stay away from.

Robots.txt was supposed to help avoid overloading websites with requests. According to Google, it is not a mechanism for keeping a webpage out of Google Search results.

If you really want to keep a web page out of Google, you should add a noindex tag or password-protect the page.

With a Robots.txt file, you can create rules for user agents specifying which directories they can access, or disallow them all. It all sounds OK in principle, but on the internet nobody really plays by the rules.

In fact, the Robots.txt file is one of the first places a bad guy might look for information on how your website is structured.

Too many websites make the mistake of using the Robots.txt file without giving thought to the fact that they might be rewarding OSINT or hacking reconnaissance efforts at the same time.

A Look at Amazon.com’s Robots.txt File

If we want to take a quick look at a big website like Amazon.com to see what their Robots.txt file looks like, all we have to do is load up this URL.

https://www.amazon.com/robots.txt

What files or directories does Amazon tell the Google search engine not to crawl or index?

It looks like the account access, login and email-a-friend features are off limits, so these are among the first places a hacker will be looking.
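To show how little effort this kind of reconnaissance takes, here is a minimal Python sketch that pulls every Disallow path out of a Robots.txt body. The function name and the sample text are my own illustration and are not a copy of Amazon’s file.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the Disallow paths in a robots.txt body, in order, without duplicates."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or not a "field: value" line
        if field.strip().lower() == "disallow":
            path = value.strip()
            if path and path not in paths:  # an empty Disallow blocks nothing
                paths.append(path)
    return paths


# Hypothetical sample, made up for this example.
SAMPLE = """\
# sample rules
User-agent: *
Disallow: /admin/
Disallow: /login   # account pages
Disallow:
"""

print(disallowed_paths(SAMPLE))
```

Every path in that list is a folder the site owner has pointed at, which is exactly why an attacker reads the file first.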

More Sample Robots.txt files from Google

# Example 1: Block only Googlebot
User-agent: Googlebot
Disallow: /

# Example 2: Block Googlebot and Adsbot
User-agent: Googlebot
User-agent: AdsBot-Google
Disallow: /

# Example 3: Block all but AdsBot crawlers
User-agent: *
Disallow: /

You can find more detailed information on how to make more complex robots.txt files over on the Google Search Central area for developers.
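If you want to check how a well-behaved crawler would read rules like these, Python’s standard library ships a Robots.txt parser. The sketch below feeds it the rules from Example 2; the helper name `may_fetch` and the example.com URL are placeholders I made up.

```python
from urllib.robotparser import RobotFileParser

# Rules from Example 2: block Googlebot and AdsBot-Google everywhere.
RULES = """\
User-agent: Googlebot
User-agent: AdsBot-Google
Disallow: /
"""


def may_fetch(rules: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler honoring `rules` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch(RULES, "Googlebot", "https://example.com/page"))
print(may_fetch(RULES, "SomeOtherBot", "https://example.com/page"))
```

Keep in mind this only tells you what a polite crawler would do. Nothing stops a scanner from ignoring the answer.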

What is Counterintelligence?

Counterintelligence typically describes an activity aimed at protecting an agency’s intelligence program from an opposition’s intelligence service.

More specifically, it covers information collection activities related to preventing espionage, sabotage, assassinations or other intelligence activities conducted by, for, or on behalf of foreign powers, organizations or persons.

In terms of this blog post, we’ll use a honeypot-based approach to see who is looking at the Robots.txt file and scanning folders we’ve asked them not to, and record information about the HTTP call for later review and analysis.

A Real World Robots.txt Based HoneyPot Example

Using the Robots.txt file as part of a honeypot system, we broadcast a list of folders we supposedly don’t want search engines to index. In this case, though, every folder on the list points to a honeypot page.

Having a honeypot / data collection service running in these folders allows you to see who is using the Robots.txt file to scan your web server thus tipping you off that OSINT footprinting activity on your webserver or domain names may be taking place.

These folders have a Disallow rule but contain honeypot code to collect information about the HTTP calls made against them and in some cases to redirect the user-agent somewhere else.

A Sample Honeypot Robots.txt File

Here is a sample Robots.txt file in which we tell all user-agents to stay away from our admin, wordpress and api folders.

# All other directories on the site are allowed by default
User-agent: *
Disallow: /admin/
Disallow: /wordpress/
Disallow: /api/

The Robots.txt file I’m discussing in this post was collected from the online classifieds website, FinditClassifieds.com.

If you try to hit any of the URLs found in the Robots.txt file, you’ll be redirected to a Rick Roll video on YouTube.com. IP address data is collected in a log for a more detailed review.

Each one of these honeypot URLs does a little something different.

The first honeypot URL replies with something naughty, the second logs it as a scan and the third URL is the Rick Roll redirect.

Depending on the honeypot page, you can collect data from the user-agent and log it before you redirect them off to the Land of Oz.
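The honeypot pages themselves can be very small. Below is a rough Python sketch of the idea, not the code running on my site: it logs the caller’s IP address, user agent and path, then decides how to answer. The path-to-action mapping, the function names and the redirect URL are all assumptions made up for this example.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical mapping of honeypot folders (from the sample Robots.txt) to actions.
HONEYPOT_ACTIONS = {
    "/admin/": "reply",         # answer with a message
    "/wordpress/": "log_only",  # just record the scan
    "/api/": "redirect",        # send the caller somewhere else
}
REDIRECT_URL = "https://example.com/elsewhere"  # placeholder destination


def log_hit(log_file, ip, user_agent, path):
    """Append one CSV row: UTC timestamp, IP, user agent, requested path."""
    stamp = datetime.now(timezone.utc).isoformat()
    csv.writer(log_file).writerow([stamp, ip, user_agent, path])


def honeypot_response(path, ip, user_agent, log_file):
    """Log honeypot hits and return (status, headers, body) for the request."""
    action = next(
        (act for prefix, act in HONEYPOT_ACTIONS.items() if path.startswith(prefix)),
        None,
    )
    if action is None:
        return 404, {}, "Not Found"  # not a honeypot path, nothing is logged
    log_hit(log_file, ip, user_agent, path)
    if action == "redirect":
        return 302, {"Location": REDIRECT_URL}, ""
    if action == "reply":
        return 403, {}, "This request has been logged."
    return 404, {}, "Not Found"  # log_only: look like an ordinary missing page


log = io.StringIO()  # stands in for a real log file
print(honeypot_response("/api/users", "203.0.113.7", "ExampleScanner/1.0", log))
print(log.getvalue())
```

In a real deployment you would call `honeypot_response` from whatever web framework or server handler you already use and write to a file or database table instead of an in-memory buffer.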

A tool that was originally created to be helpful, and ended up becoming dangerous, can now be a double agent working for you if you set it up correctly.

Hoping this helps someone on their InfoSec journey.

~Cyber Abyss

Online Scams: Puppies for Sale or Are They? Probably not! Buyer Beware & Read This before buying a Puppy Online.

Email Address: Your Internet Driver’s License

First things first. We all need an email address in order to do anything meaningful on the web. You do and the bad guys do too! I would go as far as to say that an email address is the closest thing we have to a driver’s license on the internet today. Without an email address, you are on a read-only version of the internet with no way to interact with the world.

Under U.S. federal law, websites can’t knowingly collect personal information from children under 13 without parental consent, which is why most email providers require you to be at least 13 years old. This comes from the Children’s Online Privacy Protection Act (COPPA), which is enforced by the FTC. I often have to advise my clients on these types of issues when deciding who can legally register on a website.

An email address allows you to register on websites by validating your email address. An email address / IP address combo is the easiest and most cost effective way to provide a first pass at knowing who your customers are online.

At least we are supposed to expect that they are at least 13 years old because Google and Yahoo must check this for every email account, right? LOL! This will be important to the story below.

Blue French Bulldog Puppies For Sale or Are They?

First off, let me start by saying you should read this article which is a case study on the Anatomy of a Puppy Scam by Rae Wondersmith. A great primer on the subject!

Next, I will be hiding the identity of the suspected scammer while disclosing enough details to be helpful in the analysis of the individual and the patterns observed.

The data I’m sharing comes from anti-fraud systems I’ve designed that are working in production on what I’m hoping will eventually become a popular website for local classified advertising. Maybe I’ll reveal the name of the site at the end of this article.

Blue French Bulldog Puppies for Sale

Pic of Puppies uploaded by the Scammer

On 11/30/2020 a suspected Puppy Ad Scammer created four (4) accounts in four (4) different cities in a very short period of time.

Three of the accounts came from one Comcast Cable IP in Salem, Oregon, which matched the location of one of the advertisements, so it did not raise a red flag initially.

They kept creating new accounts for various cities and creating a single ad for the same dog breed for each account. They targeted Salem, Oregon, as well as Kalamazoo, Boston and Lansing.

Then they posted again, but the IP switched from Comcast in Oregon to Verizon Fios in Virginia. The anti-fraud tools I built helped me see that it was indeed the same person registering again from the same browser, even though the IP had changed. I’m not sure if they are using some sort of VPN to shift the IP / location.
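I won’t publish how my own tools do this, but the general idea of linking registrations by browser rather than by IP can be sketched in a few lines of Python. Everything here is hypothetical: the attribute names, the sample records and the example.com addresses are made up, and the IPs come from ranges reserved for documentation.

```python
import hashlib
from collections import defaultdict


def browser_fingerprint(attrs: dict) -> str:
    """Hash a set of browser attributes into a short, stable identifier."""
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def linked_accounts(registrations):
    """Group accounts by fingerprint and keep groups seen from 2+ IP addresses."""
    groups = defaultdict(lambda: {"emails": set(), "ips": set()})
    for reg in registrations:
        fingerprint = browser_fingerprint(reg["browser"])
        groups[fingerprint]["emails"].add(reg["email"])
        groups[fingerprint]["ips"].add(reg["ip"])
    return {fp: grp for fp, grp in groups.items() if len(grp["ips"]) > 1}


# Made-up browser profiles and registrations, for illustration only.
SAME_BROWSER = {"user_agent": "ExampleBrowser/1.0", "language": "en-US", "screen": "1366x768"}
OTHER_BROWSER = {"user_agent": "OtherBrowser/2.0", "language": "en-US", "screen": "1920x1080"}
REGISTRATIONS = [
    {"email": "a@example.com", "ip": "192.0.2.10", "browser": SAME_BROWSER},
    {"email": "b@example.com", "ip": "198.51.100.20", "browser": SAME_BROWSER},
    {"email": "c@example.com", "ip": "203.0.113.30", "browser": OTHER_BROWSER},
]

for fingerprint, group in linked_accounts(REGISTRATIONS).items():
    print(fingerprint, sorted(group["emails"]), sorted(group["ips"]))
```

A fingerprint this simple will collide for people with common setups, so treat a match as a reason to look closer, not as proof.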

I know it says “Email NOT VERIFIED” in the screenshots below, but the emails were verified. I had added an email ban to the system, and a banned address shows up in this view as NOT VERIFIED. The email verification process data sits in its own database table. I collect the IP address from the user at the start and finish of the email validation process.

I can also see the email exchange in the email server log files, which I check daily. All of this data can be verified by looking at several system logs.

Connecting Accounts Created by Same Person on Earlier Sessions

Going back through recent account creations, I see another account matching one of the scammer’s email addresses.

Observations & Fraud Patterns

Broken English or poor grammar.

Example: Breath Taken Blue French Bulldog Puppies Ready Now To GO

Notice the poor grammar of “Breath Taken” instead of “Breathtaking”. There are other examples throughout the text.

Phone numbers used

A Maryland 240 area code appears in the phone number, and the same phone number is used in most of the ads.

The phone number was not used on all of the ads posted by this scammer.
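Scammers rarely type a phone number the same way twice, so it helps to normalize numbers before comparing ads. The Python sketch below is one way to do it and is not my production code. The regular expression and function names are assumptions of mine, and the sample strings are short excerpts from the ad text quoted in the raw data section of this article.

```python
import re
from collections import defaultdict

# Loose pattern for 10-digit US numbers with optional parentheses and separators.
PHONE_RE = re.compile(r"(?<!\d)\(?(\d{3})\)?[\s.\-]*(\d{3})[\s.\-]*(\d{4})(?!\d)")


def phone_numbers(text):
    """Return the set of phone numbers in `text`, normalized to 10 digits."""
    return {"".join(match.groups()) for match in PHONE_RE.finditer(text)}


def ads_by_phone(ads):
    """Map each normalized phone number to the ads that share it (2 or more)."""
    index = defaultdict(list)
    for ad_id, text in ads.items():
        for number in sorted(phone_numbers(text)):
            index[number].append(ad_id)
    return {number: ids for number, ids in index.items() if len(ids) > 1}


ADS = {
    "ad1": "all in good health and will come with paper we have 240) 242-7140",
    "ad2": "are strong,text me (240) 242-7140 for more info",
    "ad3": "you can contact now for more details",
}

print(ads_by_phone(ADS))
```

Note that the first ad is missing its opening parenthesis, which a strict pattern would miss.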

Unique Account details repeated

Same password was used on all of the accounts!

This is proprietary but yes, all accounts used the same password.

More Analysis

So far this is what I think I have and is subject to change if new data overrides this.

  • User is probably not native English speaker but may be located physically inside the US.
  • Has methods to change IP via VPN or access to computers in those cities via nefarious methods (hack) in order to hide their real IP address.
  • It is very easy to create email accounts. This person has many email addresses and personas ready to use or creates them easily and often.
  • Only targeted one breed so far

Raw Data for Analysts

In order to help analysts and law enforcement, below is the actual ad text used in the scam advertisements.

Scam Ad for Puppies #1

Much love we have for them, we are really proud to find them a good pet loving home where they will be spoiled with much love and care. they are home raised, well fed, vet checked, vaccinated and had their first shots, update on shot and dewormed, all in good health and will come with paper we have 240) 242-7140

Suspected Scammer using email address marksmille56@gmail.com for ad posted in Kalamazoo.

Scam Ad for Puppies #2

Akc registered frenchie puppies ready for x-mas ! all shots are up to date. They have already taken flea and tick dose. They have beautiful coatings, are strong,text me (240) 242-7140 for more info

Suspected scammer using email address louisesteel259@gmail.com from ad posted in Boston Mass.

Scam Ad for Puppies #3

We are proud to find a good pet loving home for our cuties. We have lovely, young, pretty healthy males and females available now for a new home. they are home raised, well fed, vet checked, vaccinated and had their first shots, update on shot and dewormed, all in good health and will come with papers. you can contact now for more details

Suspected scammer using email address randyruy71@gmail.com from ad posted in Oregon City, Oregon.

Conclusion

The internet is still the wild wild west and most people don’t understand how it works or how the bad guys use it to take advantage of us.

The above example shows just how hard it is for anyone trying to validate and vet an online user as they create multiple accounts and post data.

I hope the information I’ve provided on this subject is helpful in any research you may be doing on the subject as I expect those would be the only people reading the article down this far.

~Cheers & Happy Hunting!

~Cyber Abyss

How to Hide Executable Code in a Text File using Cloaking and Alternative Data Streams

Hacker Basics: How to Hide an Executable File Inside a Text File

Did you know that hackers can hide an executable file inside of a text file? The technique uses something called alternate data streams to keep a computer system from seeing text or an executable written to an alternate data stream inside a common text file.

I was pretty impressed the first time I watched someone demonstrate this. I was like, NO WAY! I really thought that this was some wizard level hacker stuff.

I’m no wizard level hacker, although I aspire to be, but I should be good enough to show you how to embed a simple calculator app inside a text file using an alternate data stream.

A big thank you to Cyber Security Expert, Malcolm Shore who presented a similar example in his Cyber Security Foundation online course I recently completed.

How Do Alternate Data Streams Work?

Way back in the old Wild West days when we had the DOS operating system, files used to be simple strings of data. Files were read byte by byte.

Later, with the NTFS file system, files became complex structures. At a minimum, an NTFS file contains a section called $Data where the data is read by an application. $Data is the Data Stream.

Files may have many other sections or streams other than just the $Data section. This is what we call “Alternate Streams”.

THIS IS IMPORTANT: Everyday Windows tools such as File Explorer and Notepad only show the data in the main $Data stream, so any data we put in an alternate data stream stays out of sight unless someone asks for that stream by name. We cloak data we want to hide in an alternate data stream. That’s the basics of how this works.

The data we are hiding could be a malicious malware payload or encrypted espionage message for our spy ring but in this example, it is just the simple calc.exe file you can find on any Windows PC for the last 20+ years.

Creating an Alternate Data Stream in a Text File

The screenshot below shows the three (3) files we’ll be using in this demonstration.

  • Simple text file with some string data.
  • calc.exe application or executable binary file
  • Secret text file with some string data

We can see the size of the text file is just 1 KB and the calc.exe file is 897 KB.

If we open the text-data.txt file with Notepad we’ll see just a simple line of text and the same with the secret-data.txt file.

To hide our secret message inside the text data file, we’ll use this command line command.

C:\test>type secret-data.txt > text-data.txt:hidden.text

Screenshot of Alternate Data Stream: Insert Hidden Text

Below is a screenshot of the command line command “type” that we used in this example to insert our secret-data.txt file into an Alternate Data Stream inside of another text file.

If we type the command “more” we can look for the secret message.

The screenshot below shows the text file that contains our hidden text being opened in Notepad where we can’t see the hidden text we saved to the file. If we type the command line command below, we can read the hidden text we wrote to our Alternate Data Stream by keying in on the specific data stream.

c:\test>more < text-data.txt:hidden.text
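The same file-name-colon-stream-name syntax works from code as well as from the command line. Here is a small Python sketch, using file and stream names that mirror the example above. One loud assumption: it only creates a true Alternate Data Stream when run on Windows against an NTFS drive. On other systems the colon is just part of an ordinary file name, so you get a second, visible file instead.

```python
import os
import tempfile


def write_stream(base_path, stream_name, data: bytes):
    """Write `data` to the named stream of `base_path` (an ADS on NTFS)."""
    with open(f"{base_path}:{stream_name}", "wb") as handle:
        handle.write(data)


def read_stream(base_path, stream_name) -> bytes:
    """Read the named stream back by asking for it explicitly."""
    with open(f"{base_path}:{stream_name}", "rb") as handle:
        return handle.read()


def demo():
    """Hide a message and report the visible file size before and after."""
    folder = tempfile.mkdtemp()
    base = os.path.join(folder, "text-data.txt")
    with open(base, "w") as handle:
        handle.write("Just some ordinary visible text.\n")
    size_before = os.path.getsize(base)
    write_stream(base, "hidden.text", b"secret message")
    size_after = os.path.getsize(base)
    return size_before, size_after, read_stream(base, "hidden.text")


before, after, secret = demo()
print(before, after, secret)
```

The size reported for text-data.txt is the same before and after, because the size shown for a file only counts its main stream.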

Hiding an Executable Inside a Text File

We can hide an executable file inside a text file using the exact same Alternate Data Stream technique we just used in the secret text file example above. This time we’ll simply replace the secret text file with the Windows Calculator application executable file.

The screenshot below shows the command line command to save the calc.exe file in an Alternate Data Stream inside our target text file.

Notice this time, the Alternate Data Stream is named “mycalc.exe”. Don’t get too hung up on this. It is just a name, basically metadata saved with the data, that we can use to pick out the data we want from the file. I hope that makes sense.

It is important to note at this point that the reported file size didn’t change when we inserted the calc.exe file. It is still showing 52KB.

How to Execute a File Saved in an Alternate Data Stream

To execute a file you’ve stored in an Alternate Data Stream, we’ll need to use the wmic command as is done in the following example.

c:\test>wmic process call create "c:\test\text-data.txt:mycalc.exe"

As you can see from the working example above, I was able to embed the calc.exe file, as well as a secret message, inside a text file.

If the data is text we just need to indicate which stream we saved the data in to retrieve it.

If the data we hid was an executable file, we’ll need to use the Windows “wmic” command line command to call the executable from inside the text file by keying in on the Alternate Data Stream name.

In summary, the technique is crazy easy to pull off without any 3rd party hacking tools. It just requires a little Windows Operating System inside knowledge but is something every good hacker should know.

I hope this helped somebody!
~Cyber Abyss

How to Build Your Own Website Uptime Monitoring Script using VBScript: Part 1

Website Uptime Monitoring: The Basics

There are lots of website uptime monitoring services out there but all the components you need to build your own website monitoring tool can be found in good ole’ Microsoft VBScript.

Stop laughing, I’m not kidding!

In this article, I’ll share with you some scripts and tips I’ve used successfully in the past for monitoring website uptime even if your website is running in a complex load balanced enterprise environment which some of mine are.

VBScript Components for Uptime Monitor

Most people don’t know that VBScript can make Ajax HTTP calls but it can.

We will use VBScript’s ability to make Ajax HTTP calls to our website to see if it responds then put some simple logic around that response to log the results in a text/csv file.

It really is amazingly simple once you get all the code components together.

The ISWebSiteUp Function

The isWebsiteUp function in my code example takes a URL string and makes an Ajax HTTP call to see whether the server answers. Any response, whether it is an HTTP 200 or a 404, means the web server is up.

Once we get a response, the script returns true and shows “is up” in a message box. If the call fails or times out, the script returns false and shows “is down” in a message box.

You might be saying to yourself about now, what about the 404 response code for page not found? Yes, you might want to add some more code to handle that differently than a 200 OK response, but for this script we just want to know if the server is up. If we are pointing to a page at the root of a domain, we don’t typically get 404 errors in reality.

The Script Code

To use this code, copy it into a text file and save it with a .vbs file extension for VBScript. Once you have the .vbs file, double click on it. You will see the message box with the message, “is up” or “is down”. A super simple example for our core application.


'isWebsiteUp: Takes a URL string
'isWebsiteUp: Returns True or False and shows strMessage in a message box
Function isWebsiteUp(strURL)
	Dim http, strMessage

	'Keep running if the request fails so we can report "is down" instead of crashing
	On Error Resume Next

	Set http = CreateObject("MSXML2.ServerXMLHTTP")
	'Set http = CreateObject("Microsoft.XmlHttp")
	http.open "GET", strURL, False
	http.send ""

	'Check for a failed request first, on its own, so a dead server
	'can never fall through to the "is up" branch.
	If Err.Number <> 0 Then
		isWebsiteUp = False
		strMessage = "is down"
	ElseIf http.responseText <> "" Then
		'No specific status code is tested. Any response text (200 or 404 page) means the server is up.
		'Commented out showing the response text. Use this for troubleshooting or exploring.
		'MsgBox http.responseText
		isWebsiteUp = True
		strMessage = "is up"
	Else
		isWebsiteUp = False
		strMessage = "is down"
	End If
	Set http = Nothing

	MsgBox strURL & ":" & strMessage
	Err.Clear
End Function

call isWebsiteUp("https://www.google.com") 
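The VBScript above only pops up a message box, while the plan described earlier is to log results to a text/csv file. As a rough sketch of that logging step, here is the same check written in Python using only the standard library. The function names, the CSV column order and the 10 second timeout are my own assumptions, not part of the original script.

```python
import csv
import urllib.error
import urllib.request
from datetime import datetime, timezone


def is_website_up(url, timeout=10):
    """Return (is_up, http_status). Any HTTP answer, even a 404, counts as up."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, response.status
    except urllib.error.HTTPError as exc:
        # The server answered, just with an error status such as 404 or 500.
        return True, exc.code
    except OSError:
        # Covers URLError, refused connections, DNS failures and timeouts.
        return False, None


def log_check(csv_file, url, timeout=10):
    """Check `url` once and append a row: UTC timestamp, URL, up/down, status."""
    is_up, status = is_website_up(url, timeout)
    csv.writer(csv_file).writerow([
        datetime.now(timezone.utc).isoformat(),
        url,
        "up" if is_up else "down",
        "" if status is None else status,
    ])
    return is_up
```

To use it, open a log file in append mode, for example `open("uptime.csv", "a", newline="")`, and call `log_check(log_file, "https://www.google.com")` from a scheduled task.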

What the Web Server Sees in the HTTP call: WinHTTPRequest User Agent

The VBScript Ajax HTTP call to the web server presents itself as a web browser asking for the home page.

A server admin may see this “User Agent” in their server logs.

Mozilla/4.0 (compatible; Win32; WinHttp.WinHttpRequest.5)

Script Errors & Blocked HTTP Calls

This script works out of the box. Google is the most open website in the world in terms of IPs that their servers accept traffic from as they are in the business of collecting data about everything including every system that connects to it.

Other web servers, like the ones I run, may not be so forgiving. Many server admins use the tools at their disposal to filter HTTP requests at various levels.

Here are some examples of tools Windows Server admins have at their disposal to block or filter your script from connecting to their web servers.

Windows Server Admin Tools for Handling HTTP Traffic

  • Firewall IP Restrictions (Windows Server Admin)
  • HTTP Response Filtering (IIS Application Server Admin)
  • IP Restrictions (IIS Application Server Admin)

Google Dorking Commands! Search Google for Hidden Files on the Web!

Let me start by saying the title might be a little off, as the files are not technically hidden as much as they are obscure.

Most of us would consider ourselves pretty good Google searchers these days, but the truth is, there is so much more to Google searching than meets the eye.

Introducing… “Google Dorking”

Yes, I said it Google Dorking and it’s not what you might think. Sounds dirty, right? It’s not just me. LOL

Entering Google Dorking commands, also known as Google hacking, is about searching Google in a way that filters the results and brings all sorts of OSINT and InfoSec goodies floating to the top.

Think Before You Dork!!!

Although the information may be available on Google, that does not mean you can use it to try to hack or gain unauthorized access to a system or individual computer.

Hacking is illegal, don’t do it, don’t talk about it.

With that being said, please be careful, be responsible and please enjoy these Google Dorking Examples for educational purposes.

What Kind of Trouble Can You Get into with the Info in this Article?

Using the Google Dorks commands on this page I was able to find this open camera in downtown Tehran, Iran. This all started in May of 2021. See screenshots below.

Again, be careful what you Dork and where you go. A visit to this camera got me a reverse probe XSS attack from Iranian Intelligence. That was fun!

Google Dorking Commands for user names and password in log files

Below are popular Google Dorks I’ve collected, but there are more out there, so my advice is to never stop looking.

You can also find more Dorks on popular sites like the Google Hacking Database (GHDB).

allintext:username filetype:log

Open FTP Servers

intitle:"index of" inurl:ftp

Open Web Cams

intitle:"webcamXP 5"

inurl:view/index.shtml 

Database Passwords

db_password filetype:env

Git-hub Resources

filetype:inc php -site:github.com -site:sourceforge.net

PHP Variables

filetype:php "Notice: Undefined variable: data in" -forum

Server Configuration Files

intitle:"WAMPSERVER homepage" "Server Configuration" "Apache Version"

Nessus Scan Reports

intitle:"report" ("qualys"|"acunetix"|"nessus"|"netsparker"|"nmap") filetype:pdf

Networking Excel Xls Files

ext:xls networking

FrontPage Servers w/ Admin Info

"#-Frontpage-" inurl:administrators.pwd

Unprotected Cameras

inurl:view/index.shtml

Hidden Login Pages

Username password site:com filetype:txt DomainName.com

Domains and Subdomains

site:*.site.com -www
site:*.*.site.com -www
site:*.*.*.site.com -www
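If you run the subdomain dorks often, you can generate them instead of typing them. This is a small Python sketch of that idea. The helper names are mine, and the only thing it assumes about Google is the familiar `q` query parameter on the search page. It builds search strings and URLs only and does not send any requests.

```python
from urllib.parse import urlencode


def subdomain_dorks(domain, depth=3):
    """Build the site: dorks for 1 to `depth` levels of wildcard subdomains."""
    return [f"site:{'*.' * level}{domain} -www" for level in range(1, depth + 1)]


def google_search_url(dork):
    """Return a Google search URL with the dork safely URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": dork})


for dork in subdomain_dorks("example.com"):
    print(dork)
    print(google_search_url(dork))
```

Swap example.com for a domain you own or are authorized to assess.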

Google Dorking Video by Null Byte

Hope this helps somebody!
~Cyber Abyss