dfir it!

responding to incidents with candied bacon

Webshells - Every Time the Same Story…(Part 3)

The previous post in this series analyzed an attack that relied on webshells. Attacks like that show how difficult it is to secure an entire infrastructure against them. This part evaluates the available detection tools and provides prevention and mitigation recommendations.

Webshell detection tools

I evaluated the following projects that focus on webshell detection:

  • NeoPI
  • Shell Detector
  • LOKI

These tools were tested against the files presented in part 1, with the addition of a few new ones.

The tests measured the detection accuracy of each tool against a set of different webshells mixed with hundreds of legitimate files from GitHub repositories and other public sources:

  • index.html from different popular websites
  • ASPX files
  • PHP files
  • JavaScript files

NeoPI

I tested NeoPI first. According to the project’s GitHub page, NeoPI is a Python script that uses a variety of statistical methods to detect obfuscated and encrypted content. The output below shows the result of running the tool against the set of files described above:


[[ Total files scanned: 4323 ]]
[[ Total files ignored: 0 ]]
[[ Scan Time: 16.773207 seconds ]]

[[ Average IC for Search ]]
0.0762022597838

[[ Top 10 lowest IC files ]]
  0.0153        ../webshell_db_short/myluph.php
  0.0168        ../webshell_db_short/vero.txt
  0.0202        ../webshell_db_short/unknownPHP.php
  0.0248        ../webshell_db_short/phpcollection/2.php
  0.0262        ../webshell_db_short/myluphdecoded.php
  0.0268        ../webshell_db_short/phpcollection/wkv3.php
  0.0270        ../webshell_db_short/china.aspx
  0.0284        ../webshell_db_short/phpcollection/agenda.ics.php
  0.0285        ../webshell_db_short/phpcollection/config.xml.php
  0.0289        ../webshell_db_short/phpcollection/uploads.php

[[ Top 10 entropic files for a given search ]]
  6.2409        ../webshell_db_short/phpcollection/phpmailer.lang-zh.php
  6.2355        ../webshell_db_short/phpcollection/phpmailer.lang-zh_cn.php
  6.1932        ../webshell_db_short/unknownPHP.php
  6.1622        ../webshell_db_short/phpcollection/phpmailer.lang-ch.php
  6.0307        ../webshell_db_short/vero.txt
  6.0258        ../webshell_db_short/myluph.php
  6.0151        ../webshell_db_short/phpcollection/phpmailer.lang-ko.php
  5.9169        ../webshell_db_short/phpcollection/phpmailer.lang-ja.php
  5.7736        ../webshell_db_short/phpcollection/1.php
  5.7393        ../webshell_db_short/phpcollection/phpmailer.lang-vi.php

[[ Top 10 longest word files ]]
  554750        ../webshell_db_short/phpcollection/wkv3.php
   11999        ../webshell_db_short/phpcollection/full_dump.php
   11999        ../webshell_db_short/phpcollection/contentobjects.php
    1774        ../webshell_db_short/myluph.php
     660        ../webshell_db_short/vero.txt
     641        ../webshell_db_short/c99shell.php
     547        ../webshell_db_short/phpcollection/EmailAddressValidator.php
     356        ../webshell_db_short/phpcollection/priv.txt
     197        ../webshell_db_short/phpcollection/emission.xml (2).php
     197        ../webshell_db_short/phpcollection/emission.xml.php

[[ Top 10 signature match counts ]]
      85        ../webshell_db_short/c99shell.php
      35        ../webshell_db_short/phpcollection/run-tests.php
      27        ../webshell_db_short/phpcollection/WikiComments.aspx
      24        ../webshell_db_short/phpcollection/MemberSearch.aspx
      22        ../webshell_db_short/phpcollection/CustomPageManagement.aspx
      22        ../webshell_db_short/phpcollection/Comments.aspx
      20        ../webshell_db_short/phpcollection/phpmailerTest.php
      20        ../webshell_db_short/phpcollection/ManageTerms.aspx
      20        ../webshell_db_short/phpcollection/TimestampIntegrationTest.php
      17        ../webshell_db_short/byroe.jpg

[[ Top cumulative ranked files ]]
      56        ../webshell_db_short/myluph.php
      57        ../webshell_db_short/vero.txt
     176        ../webshell_db_short/c99shell.php
     219        ../webshell_db_short/phpcollection/wkv3.php
     225        ../webshell_db_short/phpcollection/1.php
     372        ../webshell_db_short/myluphdecoded.php
     444        ../webshell_db_short/phpcollection/profile.php
     525        ../webshell_db_short/phpcollection/WikiComments.aspx
     570        ../webshell_db_short/phpcollection/uploadpostattachment.aspx
     595        ../webshell_db_short/phpcollection/Fields.aspx

Pros:

  • detection ratio: 6 out of 9 webshell files
  • successful detection of clean and obfuscated code of the same webshell
  • the more complex the code structure, the better the results and detection ratio
  • various methodologies to detect webshells - signatures, index of coincidence (IC), ratio, entropy, longest keyword matching

Cons:

  • failed detection of simple one-line webshells (e.g. China Chopper)
  • false negatives and positives in different categories, including final rankings
  • manual triage and additional analysis of the highlighted files is required for some of the methodologies (e.g. entropy, keyword matching)
  • signature database is outdated, as the project appears to be no longer developed
  • webshells hidden inside another file format (byroe.jpg) will not be detected among a wide range of files, because NeoPI produces a large number of false positives

It would be really helpful if NeoPI summarized the files flagged by more than one heuristic. For instance, in my test byroe.jpg was visible in the top ten for signature matches, longest word and entropy, but not in the top cumulative ranked files.
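
The idea of combining heuristics can be sketched in a few lines of Python. This is only an illustration and not part of NeoPI: the function name cross_heuristic_hits and the min_hits threshold are my own assumptions, and the input lists are a handful of entries picked from the output above rather than the full top-ten lists.

```python
from collections import defaultdict


def cross_heuristic_hits(top_lists, min_hits=2):
    """Return files that appear in at least `min_hits` heuristic lists.

    `top_lists` maps a heuristic name to the files it ranked highly.
    The result is sorted by number of heuristics (descending), then name.
    """
    hits = defaultdict(set)
    for heuristic, files in top_lists.items():
        for path in files:
            hits[path].add(heuristic)
    flagged = {p: sorted(h) for p, h in hits.items() if len(h) >= min_hits}
    return sorted(flagged.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical input: a few entries selected from the NeoPI output above.
top_lists = {
    "ic": ["myluph.php", "vero.txt", "unknownPHP.php"],
    "entropy": ["unknownPHP.php", "vero.txt", "myluph.php"],
    "longest_word": ["wkv3.php", "myluph.php", "vero.txt", "c99shell.php"],
    "signature": ["c99shell.php", "byroe.jpg"],
}

for path, heuristics in cross_heuristic_hits(top_lists):
    print(f"{len(heuristics)} heuristics: {path} ({', '.join(heuristics)})")
```

A summary like this makes it easier to spot a file that several heuristics rank highly even when the cumulative rank misses it.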

Even though NeoPI has not been updated for the last 4 years, did not detect all types of webshells and produced a number of false negatives, it still achieved quite impressive detection rates on relatively new webshell samples. I can recommend adding NeoPI to your webshell analysis toolbox. InfoSec Institute has a nice write-up on NeoPI with some additional details.

Shell Detector

Shell Detector was the second tool I evaluated. I really liked how the results were presented in the console.

A web version is also available.

Pros:

  • detection ratio: 7 out of 9 webshell files (5 as suspicious + 2 webshell)
  • successful detection of clean and obfuscated code of the same webshell
  • final results presented in a clear graphical form

Cons:

  • 131 false positives based on suspicious word existence
  • only signature based detection
  • webshell signature database out of date
  • sluggish interface when the number of results is high (web version)
  • signature database is stored in serialized PHP format (not scalable)
  • byroe.jpg was not detected, because Shell Detector does not support JPG files

To sum up, even though the signature database appears to be out of date, the tool correctly flagged almost all of the webshell files. It can provide powerful detection capability as long as the signature database is kept up to date.

LOKI

LOKI presents scan results in a terminal, coloring entries according to their severity. It also writes all matches to a single log file. The rules are written in YARA, an easy to use yet very powerful language for identifying and classifying malware, which appears to be a tool of choice in the security industry. According to the project’s website, the most effective rules were borrowed from the rule sets of its bigger brother, THOR APT Scanner. For me, the most interesting ones were those dedicated to webshell detection.

My first scan of the sample set with the default signature database showed a moderate detection ratio (5/9). With YARA’s growing popularity in the infosec world, it is possible to build and maintain a powerful database to hunt malware, including webshells, and to research new obfuscation techniques and variants observed in the wild. Taking that into account, I decided to improve on the initial results. I found a set of rules that almost perfectly matched my expectations. After a quick adjustment, the final score was close to ideal (8/9). The changes were really tiny, so I’ll describe them briefly:

  • Change the $php string to “<?” in a new rule created from misc_php_exploits
  • Add “system($_REQUEST” to misc_php_exploits and to the newly created rule from the point above
  • Remove two strings from the misc_shells rule - $s6 and $s8 (that one was even marked with a comment saying it could generate false positives, so it was easy ;)

After all of that, I got what turned out to be the biggest advantage of LOKI - the number of false positives was zero!

Pros:

  • detection ratio: 8 out of 9 webshell files
  • successful detection of clean and obfuscated code of the same webshell
  • final results provided in a clear log file
  • zero false positives (but that really depends on the YARA rule set you use)
  • easy to develop new signatures as YARA rules
  • supports all extensions

Cons:

  • only signature based detection for webshells

Summary

To sum up the results from all the tools: it is really hard to develop a single tool that flags webshells as suspicious with good accuracy. That is because a wide range of different functions, methods and encodings can be used to achieve the same effect. Attackers don’t need to use the base64_decode function to decode their base64 code. Instead, they can add their own custom function to do exactly that. They can use a string lookup array to avoid keyword-based detection, invoke functions by names assembled with str_replace, and much more. Imperva did great research describing various techniques in their blog post.
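
To make the evasion problem concrete, here is a minimal Python sketch of a naive keyword scanner. It does not reproduce the logic of any of the tools above; the keyword list and both PHP snippets are hypothetical and were written only for this illustration.

```python
# Hypothetical keyword list; real signature sets are far larger.
SUSPICIOUS_KEYWORDS = ("base64_decode", "eval(", "system(")


def keyword_hits(source):
    """Return the suspicious keywords that literally occur in the source."""
    return [kw for kw in SUSPICIOUS_KEYWORDS if kw in source]


# Illustrative PHP fragments, held as inert strings.
plain_sample = "<?php eval(base64_decode($_POST['x'])); ?>"
obfuscated_sample = (
    "<?php $f = str_replace('#', '', 'ba#se6#4_dec#ode'); "
    "$g = $f($_POST['x']); ?>"
)

print(keyword_hits(plain_sample))       # literal keywords are found
print(keyword_hits(obfuscated_sample))  # name is only assembled at runtime
```

The second sample ends up calling the same decoding function, but the name never appears literally in the file, so a scanner that relies only on keywords has nothing to match.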

The only webshell not detected by LOKI was unknownPHP.php, whose obfuscation technique is really advanced - thanks to Darryl from Kahu Security, you can follow the decoding process in a great post. Since it is not possible to detect it with general signature rules, NeoPI’s methods (entropy, index of coincidence) are an excellent solution for this kind of backdoor. Together with LOKI, NeoPI seems to be a powerful weapon for detecting webshells.
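
For readers who have not met these two measures before, the sketch below shows one common way to compute them over the bytes of a file. It is my own minimal version, not NeoPI’s code; the function names and the two sample inputs are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def index_of_coincidence(data):
    """Probability that two bytes drawn without replacement are equal."""
    total = len(data)
    if total < 2:
        return 0.0
    counts = Counter(data)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))


# Hypothetical inputs: repetitive plain code vs. a flat byte distribution
# standing in for densely encoded content.
plain = b"<?php echo 'hello world'; echo 'hello again'; ?>" * 4
packed = bytes(range(256))

for name, blob in (("plain", plain), ("packed", packed)):
    print(name, round(shannon_entropy(blob), 4), round(index_of_coincidence(blob), 4))
```

Ordinary source code repeats a small set of characters, which gives a relatively high IC and moderate entropy. Encoded or encrypted payloads flatten the byte distribution, which pushes entropy up and IC down, and that is why NeoPI ranks files by highest entropy and lowest IC.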

Prevention and mitigation

There are a few things that can be done to protect organizations against server compromise:

  • PATCH! - it sounds silly because it seems SO obvious, but last year showed that even a well-known vulnerability like Heartbleed doesn’t guarantee that administrators do their job. Two months after the public disclosure, there were still around 300k vulnerable servers
  • harden your web server - implement a least-privilege policy on the web server, limit script execution permissions in specific locations etc.
  • deploy a DMZ (demilitarized zone) - enable logging of allowed and blocked traffic, limit interaction between the DMZ and your production environment
  • deploy a reverse proxy with a WAF (Web Application Firewall) - restrict accessible URL paths to legitimate sources only, using for example the free ModSecurity or a commercial product, and consider fuzzy hash matching
  • regularly test your environment - conduct virus signature checks (e.g. those used by the WAF), application fuzzing, code reviews and server network analysis
  • regularly test systems and applications - check the application’s security with pentests and vulnerability scans to establish areas of risk
  • versioning + backup - keep an offline “known good” backup of all critical servers and monitor for changes to have a clear history of what happened on the servers
  • input validation - validate user input to limit local and remote file inclusion vulnerabilities
  • scan all files coming in to the web server (if you accept file uploads from users) - as shown before, the administrator cannot trust file extensions, since an extension could be just a trick to hide malware
  • always follow social media discussions! ;)
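
The point about not trusting file extensions can be illustrated with a short Python sketch. It is a simplified check under my own assumptions, not a replacement for a real upload scanner: the function name inspect_upload, the marker list and the sample byte strings are all hypothetical.

```python
from typing import List

JPEG_MAGIC = b"\xff\xd8\xff"
# Hypothetical marker list; extend it for the languages your server executes.
SCRIPT_MARKERS = (b"<?php", b"<?=", b"<%@", b"<script")


def inspect_upload(filename: str, content: bytes) -> List[str]:
    """Return a list of findings for an uploaded file (empty if none)."""
    findings = []
    claims_jpeg = filename.lower().endswith((".jpg", ".jpeg"))
    if claims_jpeg and not content.startswith(JPEG_MAGIC):
        findings.append("extension says JPEG but magic bytes do not match")
    lowered = content.lower()
    for marker in SCRIPT_MARKERS:
        if marker in lowered:
            findings.append(f"embedded script marker {marker!r}")
    return findings


# Hypothetical samples: a JPEG-like header, with and without appended PHP.
clean = JPEG_MAGIC + b"\xe0\x00\x10JFIF\x00" + b"\x00" * 16
tainted = JPEG_MAGIC + b"\xe0\x00\x10JFIF\x00<?php system($_REQUEST['c']); ?>"

print(inspect_upload("holiday.jpg", clean))
print(inspect_upload("holiday.jpg", tainted))
```

Short markers such as “<?=” can occur by chance in binary data, so findings from a check like this should feed a triage queue rather than an automatic verdict.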

The community also has its own ideas:

  • AV/HIDS scan of the web server…

Let me digress a little on the last recommendation. First of all, as you know, AV is not a fail-safe mechanism, so you cannot trust it fully. AV products do not protect against all attack vectors, and it is relatively easy to bypass AV. Still, it lets you at least block known malicious code (detected by signatures or heuristics) - not ideal, but still an advantage.

When you’ve got AV on your web server (or any other machine for that matter) you need to know that there are costs involved:

  • additional risk - AV adds code to your machine that could itself be vulnerable to different types of attacks such as RCE, local privilege escalation, sandbox escape, etc. Details can be found in Joxean Koret’s presentation or in the Google Project Zero posts (1, 2)
  • performance - every AV causes some performance loss, which is periodically measured and reported by the AV-Comparatives organization

Conclusion

The whole series was intended to show you how popular, diverse and at the same time dangerous attacks leveraging webshells are. As the second part of this series showed, the crooks were targeting specific companies, and webshells are only a small part of a bigger plan. The variety and simplicity of webshells make defending against them a very difficult task. Even following all the recommendations from the “Prevention and mitigation” section does not guarantee that your application/environment is 100% safe, but it is important to build security in a comprehensive manner and to leave attackers as little room as possible to beat our “entanglements” ;) Keep fighting! Keep defending!
