The mySites.guru suspect content tool – One of the many unique tools within the mySites.guru audit and Ultimate Toolset for WordPress/Joomla is the Suspect Content Tool. This is our most popular unique tool, so let’s discuss it today.
The average site among the 75,000+ sites connected to mySites.guru has just under 20,000 files.
Overview
The mySites.guru suspect content tool is a cool feature in the mySites.guru service’s audit suite.
This tool helps you find and review files that might be problematic, or that match our detection rules for other reasons.
It’s like finding important stuff in a big pile of files on your website. Instead of going through a million files by hand, we give you a short list to look at.
Data Gathering – the mySites.guru audit
The start of the process is to run a mySites.guru audit.
This gathers information on every file in your webspace – with no exceptions. The audit process takes a short while but you can walk away from the screen and come back later.
Subscribers can schedule audits to run as often as they like, or run them on demand.
At the start of every audit we also run our snapshot tools, capturing over 100 quick checks of your site. Added to the audit that’s even more checks!
The audit first compiles a list of all the folders in your webspace – without exceptions – and then grabs a list of the files in those folders.
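As a rough illustration of that two-step listing (folders first, then the files inside them), here is a minimal Python sketch. It is not the mySites.guru implementation; the function names `list_folders` and `list_files` are hypothetical, and it simply shows that nothing, including hidden folders, is skipped.

```python
import os
from typing import List


def list_folders(root: str) -> List[str]:
    """Return every folder under root (root included), hidden ones too."""
    folders = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        folders.append(dirpath)
    return sorted(folders)


def list_files(folders: List[str]) -> List[str]:
    """Return every regular file that sits directly inside the given folders."""
    files = []
    for folder in folders:
        with os.scandir(folder) as entries:
            for entry in entries:
                # Symlinks are not followed in this sketch (an assumption).
                if entry.is_file(follow_symlinks=False):
                    files.append(entry.path)
    return sorted(files)


if __name__ == "__main__":
    all_folders = list_folders(".")
    all_files = list_files(all_folders)
    print(f"{len(all_folders)} folders, {len(all_files)} files")
```

Splitting the work into a folder pass and a file pass keeps the later per-file checks simple, because each one receives a plain list of paths.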
We then run an exhaustive process which includes:
- Determine if the file belongs to the core of Joomla or WordPress.
- If it’s a core file, check if it has been altered since its release.
- If the core file is modified, compare it with the original file.
- Save the md5 hash of the file for future comparisons.
- Go through every line of code in each file.
- Scan each line for nearly 2000 known patterns of previous hacks; if found, label the file as “suspect.”
- Verify the md5 hash of the file against a database of over 14,000 confirmed “hacked” file hashes – no false positives, as each hash has been manually validated.
- Examine file metadata, including creation and modification dates, and explore EXIF data on images where hacks are commonly found.
- Identify encrypted files, PHP error logs, archive files, files larger than 2MB, zero-byte files, and other file classifications.
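A few of the steps above (saving the md5 hash, reading file metadata, and labelling zero-byte files, large files, archives and error logs) can be sketched in Python as follows. This is only an illustration under stated assumptions: the archive extensions, the error log file names and the function names are hypothetical and are not taken from the mySites.guru code.

```python
import hashlib
import os

LARGE_FILE_BYTES = 2 * 1024 * 1024  # "files larger than 2MB"
# Assumed lists, for illustration only.
ARCHIVE_EXTENSIONS = {".zip", ".tar", ".gz", ".tgz", ".bz2", ".rar", ".7z"}
ERROR_LOG_NAMES = {"error_log", "php_errors.log"}


def md5_of_file(path, chunk_size=65536):
    """Hash the file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_file(path):
    """Return a set of classification labels for one file."""
    size = os.path.getsize(path)
    name = os.path.basename(path).lower()
    extension = os.path.splitext(name)[1]
    labels = set()
    if size == 0:
        labels.add("zero-byte")
    if size > LARGE_FILE_BYTES:
        labels.add("large")
    if extension in ARCHIVE_EXTENSIONS:
        labels.add("archive")
    if name in ERROR_LOG_NAMES:
        labels.add("php-error-log")
    return labels


def audit_record(path):
    """Collect the hash, basic metadata and labels for one file."""
    stat = os.stat(path)
    return {
        "path": path,
        "md5": md5_of_file(path),
        "size": stat.st_size,
        "modified": stat.st_mtime,
        "labels": sorted(classify_file(path)),
    }
```

Storing the hash with each record is what makes the "future comparisons" step cheap: on the next audit only the hashes need to be compared to spot a changed file.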
Once the audit is over we notify you so you can log in to mySites.guru and review the results. The screenshot below shows the first three sections of the audit tab.
Check every line in every file
During the audit, every line in each file within your webspace is thoroughly examined. The only exceptions are our internal list of exclusions and whitelisted files, hashes and patterns.
What sets us apart from other “scanners” is that our audit looks at the content of every file in your webspace. We inspect all files in your webspace, not just the rendered output of your site. We even analyze files that play no part in rendering your site, which is where backdoors often hide. These hidden vulnerabilities may linger in your webspace unnoticed for years before being discovered and exploited by hackers.
Suspect Content Match
The mySites.guru audit has several ways to identify suspect content.
The main two are:
- Over 2000 regex patterns based on historic hacks seen on real Joomla/WordPress sites over the years, including emerging hacks and mutated hacks seen in the last few weeks. These files are just suspect – there will be false positives by design – and not everything from this list will be bad, or hacks, or backdoors.
- Whole-file hashes: if the hash of an entire file matches one we hold, the file is instantly marked as hacked, with a red [Hacked] label in the file tool results. These are normally backdoors and match hashes of files we have seen in the past.
Complete file match – md5 hash
When we discover a backdoor file (like c99, r57 or any file that is hacked) we calculate the md5 hash of the entire file and store that.
On the next audit of any site connected to mySites.guru, we distribute these new hashes and look for files in YOUR webspace that might match these.
If a file on your site matches any of our hashes then we will mark that with a red [HACKED FILE] flag in the audit results.
There are no false positives here. If the hash matches it IS a hacked file. Fact.
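The whole-file match described in this section boils down to a set lookup. The Python sketch below is a hedged illustration, not our production code: `md5_hex` and `is_known_hacked` are hypothetical names, and the "backdoor" bytes are a harmless placeholder.

```python
import hashlib
from typing import Set


def md5_hex(content: bytes) -> str:
    """Return the md5 hash of the entire file content as a hex string."""
    return hashlib.md5(content).hexdigest()


def is_known_hacked(content: bytes, known_hashes: Set[str]) -> bool:
    """True only when the whole file is byte-for-byte a known hacked file."""
    return md5_hex(content) in known_hashes


if __name__ == "__main__":
    # Placeholder content standing in for a confirmed backdoor file.
    backdoor = b"<?php /* placeholder backdoor body */ ?>"
    known_hashes = {md5_hex(backdoor)}

    print(is_known_hacked(backdoor, known_hashes))         # exact copy
    print(is_known_hacked(backdoor + b" ", known_hashes))  # one extra byte
```

The second call shows the limit of this approach: changing a single byte produces a different hash, so a modified copy of a backdoor no longer matches. That is where the pattern-based detection in the next section comes in.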
Single pattern match – Over 2000 regex patterns
The second level of detection is suspect content based on regex.
“Regex” (regular expression) matching is pattern based. We use over 2000 patterns that we have curated over the last decade, and improve daily.
These find the normal things like use of eval() functions, base64_decode and gzinflate on the same line and the other cool tricks that hackers use.
Our regex patterns also find PARTS of hacks. This way a hacked file can be inspected line by line to find smaller snippets of hacks, where a hack has been injected into an otherwise genuine file rather than the whole file being a hack/backdoor.
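To show what a line-by-line pattern scan looks like in practice, here is a small Python sketch. The two patterns are simplified examples written for this article, not entries from our real pattern list, and `scan_lines` is a hypothetical helper name.

```python
import re
from typing import List, Tuple

# Two illustrative patterns only; the real list is far larger.
PATTERNS = {
    "eval-of-decoded-payload": re.compile(
        r"eval\s*\(\s*(?:gzinflate|base64_decode)\s*\(", re.IGNORECASE
    ),
    "gzinflate-of-base64": re.compile(
        r"gzinflate\s*\(\s*base64_decode\s*\(", re.IGNORECASE
    ),
}


def scan_lines(source: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern name) for every pattern hit in the source."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits


if __name__ == "__main__":
    # A genuine-looking file with one injected line.
    sample = "<?php\necho 'hello';\neval(gzinflate(base64_decode('...')));\n"
    for number, name in scan_lines(sample):
        print(f"line {number}: {name}")
```

Because each hit carries its line number, a file that is mostly genuine can still be flagged for the one injected line, and a reviewer can jump straight to it.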
Not every match with our patterns necessarily signifies a hack; this is intentional. The challenge stems from PHP being the shared language for both your authentic code and the hacks (primarily). Think of it like both parties conversing in English. If we search for a common word like “The,” matches will emerge in both hacks and genuine code. Yet, we diligently work to keep legitimate matches to a minimum. This means you won’t have to comb through 20,000 files to find a hack; the mySites.guru audit pinpoints and allows you to review only a handful of identified files.
Reduce your time looking for hacks
Based on the 63,000 sites mySites.guru is connected to, the average site has 19,882 files! That is a lot of files to manually sort through looking for hacks.
The results of the mySites.guru audit give you a handful of files to look through – what’s more, we give you an easy interface to view the exact lines of the exact files that we list – no need to fire up your FTP application or look through your file system manually.
If your file is a known backdoor for a hacker – we mark it as such!
By clicking any of the file names, you can see a preview of the section of the file we think is suspect. You can also see when it was modified, its size, and its permissions.
You can use our tools to edit the file directly in mySites.guru and then save the changes, and we will upload them to your site – no need to find your FTP Client! You can also delete the whole file with a single click.
Crowd Sourced Data Model
After every audit, the mySites.guru detection improves. Anonymous data on the suspect content found is submitted to our internal queue and after manual review is added to future iterations of our data model.
In plain language, this means if a new hack is found on a Joomla Site, then on the next audit of YOUR Joomla sites, we will look for that hack – this means by being connected to mySites.guru you benefit from all the knowledge gained in fixing and identifying hacks on all other sites.
This also allows us to track trends and waves of infections and improve the detection of new and mutated hacks and backdoors.
This data model improvement alone makes mySites.guru unique and sets us apart!
Detection Improves Daily
Across the mySites.guru service we run over 3000 audits of Joomla and WordPress sites per day (at the time of writing) – this means we always have up to date information on the very latest hacks and waves of backdoors seen across the world.
We find over 200 hacked sites a week.
What about false positives?
Not everything that matches our patterns will be a hack!
This is by design!
The problem is that PHP is the language your genuine code is written in, and PHP is also (mainly) the language used by the hacks, so you are both using the same language. For example, if you both spoke English and we searched for the word “The”, there would be matches in both hacks and genuine code. However, we work very hard to reduce the number of genuine matches to a minimum. Instead of having to look at 20,000 files for a hack, you can review the handful of files that the mySites.guru audit identifies.
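The trade-off is easy to see with a small Python sketch. Both patterns and all three PHP lines below are made up for illustration and are not taken from our pattern list: a loose pattern flags genuine code as well as hacks, while a tight pattern avoids the false positive but misses a lightly mutated hack.

```python
import re

# Loose: any call to base64_decode. Tight: eval wrapped directly around it.
LOOSE = re.compile(r"base64_decode\s*\(", re.IGNORECASE)
TIGHT = re.compile(r"eval\s*\(\s*base64_decode\s*\(", re.IGNORECASE)

GENUINE = "$image = base64_decode($row['thumbnail']);"
HACK = "eval(base64_decode($_POST['x']));"
MUTATED_HACK = "$p = base64_decode($_POST['x']); eval($p);"


def matches(pattern, line):
    """True when the pattern is found anywhere in the line."""
    return bool(pattern.search(line))


if __name__ == "__main__":
    for label, line in [
        ("genuine", GENUINE),
        ("hack", HACK),
        ("mutated hack", MUTATED_HACK),
    ]:
        print(label, "| loose:", matches(LOOSE, line), "| tight:", matches(TIGHT, line))
```

The loose pattern matches all three lines, including the genuine one. The tight pattern matches only the plain hack and lets the mutated one through, which is why showing some false positives to a human reviewer can be the safer choice.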
Can I whitelist files or folders?
No.
To allow exceptions would defeat the inclusiveness of the tool and water down its effectiveness.
I expect you WILL get false positives, and that is fine, annoying but fine. You get these because we have chosen to show you rather than hide these files, just in case. Sometimes pattern matching is not enough and a human with experience in code needs to make a judgement call on a file (Feel free to ask me for a quick peek!)
You see, hackers use the same code that good developers use (Like curl, file_get_contents, $_GET etc…) – so sometimes it’s hard to tell if some code is evil, without context, and you cannot get context with a dumb tool that pattern matches.
We do not allow users to whitelist anything anymore
We used to, but it soon became clear that some users don’t have a clue what’s a hack and what’s not. One user whitelisted everything and then sued us for not telling him his site was hacked. After legal fees we were £14,000 out of pocket.
Plus, as mySites.guru uses crowdsourced data and machine learning, too many “fake” whitelists have a huge knock-on effect on our integrity.
I personally am the only one that whitelists, and I do it rarely
The whole point of the tool, as it clearly explains, is to generate false positives as well as 100% exact matches – this way we also capture emerging hacks and extremely bad practice by extension developers.
Comparison to external “scanners”
Other services claim to have an “audit” tool. Most of the time they mean they have implemented the Sucuri SiteCheck API, which only “scans” your site as a visiting browser would. It doesn’t check the files in your webspace, and doesn’t find anything that is hidden under the surface of your rendered webpages. Be warned. Not all “Audits” are in-depth and comprehensive! Make sure you compare apples with apples. Not everyone claiming to be an “apple” is.
Current Limitations
We currently (at the time of writing) do not scan database tables for malware – meaning that we will sometimes miss WordPress SQL injected posts. We are actively working on a solution for this.
BONUS: Out of your depth? Need help?
If the mySites.guru audit finds your Joomla or WordPress site is hacked, and you are unsure how to fix it with our tools, or just want us to take care of everything for you, you can escalate this to us using the service at https://fix.mysites.guru/ for SET FEE priced hack fixes.
Last updated on January 5th, 2025