Financial fraud losses in the UK totalled £768.8m in 2016, up 2 per cent on 2015, according to Financial Fraud Action UK.…
Posted: 30 Mar 2017 | 9:02 am
Posted: 30 Mar 2017 | 5:44 am
By Jon Oliver and Jayson Pryde
Locality Sensitive Hashing (LSH) is an algorithm known for enabling scalable, approximate nearest neighbor search of objects. LSH enables a precomputation of a hash that can be quickly compared with another hash to ascertain their similarity. A practical application of LSH would be to employ it to optimize data processing and analysis. An example is transportation company Uber, which implemented LSH in the infrastructure that handles much of its data to identify trips with overlapping routes and reduce inconsistencies in GPS data. Trend Micro has been actively researching and publishing reports in this field since 2009. In 2013, we open sourced an implementation of LSH suitable for security solutions: Trend Micro Locality Sensitive Hashing (TLSH).
TLSH is an approach to LSH, a kind of fuzzy hashing that can be employed in machine learning extensions of whitelisting. TLSH can generate hash values which can then be analyzed for similarities. TLSH helps determine if the file is safe to be run on the system based on its similarity to known, legitimate files. Thousands of hashes of different versions of a single application, for instance, can be sorted through and streamlined for comparison and further analysis. Metadata, such as certificates, can then be utilized to confirm if the file is legitimate.
TLSH is also built with proactive collaboration in mind. We have provided open-source tools to help study, evaluate, and further improve TLSH. We also have a regularly updated backend query service that independent security researchers and partners can use to query and compare their files for their similarity to known, good files.
Our research showed that, compared to other open-source versions of LSH, TLSH is a highly accurate similarity digest that is more flexible in its range, less vulnerable to attacks, and enables a quick search mechanism. TLSH also supports Linux, Windows (Visual Studio), and Python Extension environments.
How TLSH can Help Enterprises
Identifying safe applications and files for their environment is an important task for any enterprise. This is typically done by IT and system administrators who might use methods such as whitelisting (comparing against a known list of good files), or using certificates. These approaches have limitations though; whitelisting solutions may not have a complete list of good files. Keeping up with rapid changes on a legitimate file can also be challenging, and is a known issue with whitelisting solutions. On the other hand, relying only on certificates can create security holes since a certificate infrastructure can also be compromised.
Enterprises can categorize files as:
Enterprises can adopt certain policies such as:
A. Only allow the execution of files in Group WL.
B. Only allow the execution of files (in Group WL OR in Group TC ).
Policy A may be too restrictive: legitimate files and security updates that are not yet on the whitelist may be present. Policy B may be problematic, given how certificate issuers may be abused (malware can be signed by trusted certificate issuers), or how a certificate infrastructure can be compromised.
TLSH allows us to define another group of files:
Group TLSH may still contain malware in that malicious components can be inserted into legitimate applications. A similarity test can be combined with metadata, such as trusted certificate signers, to determine a suitable set of software to allow. This results in a new policy:
C. Only allow the execution of files (in Group WL OR in Intersection [Group TLSH, Group TC])
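The three policies can be expressed as set-membership checks. The following is a minimal sketch, not the product's implementation: it assumes Group WL means "on the whitelist", Group TC means "signed by a trusted certificate", and Group TLSH means "similar to a whitelisted file", and the file names are hypothetical.

```python
from typing import Set


def policy_a(file_id: str, wl: Set[str]) -> bool:
    """Policy A: allow only files in Group WL."""
    return file_id in wl


def policy_b(file_id: str, wl: Set[str], tc: Set[str]) -> bool:
    """Policy B: allow files in Group WL OR in Group TC."""
    return file_id in wl or file_id in tc


def policy_c(file_id: str, wl: Set[str], tc: Set[str], tlsh: Set[str]) -> bool:
    """Policy C: allow files in Group WL OR in both Group TLSH and Group TC."""
    return file_id in wl or (file_id in tlsh and file_id in tc)


# Hypothetical group memberships, for illustration only.
GROUP_WL = {"app-1.0.exe"}
GROUP_TC = {"app-1.1.exe", "signed-malware.exe"}
GROUP_TLSH = {"app-1.1.exe", "trojanized-app.exe"}
```

With these example groups, Policy C admits `app-1.1.exe` (similar to a known file and signed by a trusted signer) but rejects both `signed-malware.exe` (signed but not similar) and `trojanized-app.exe` (similar but not signed by a trusted signer).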
When added to whitelisting systems, TLSH can intuitively keep pace with a software’s fast turnaround of patches and version releases. This provides a more efficient way for organizations to keep their systems updated, while also significantly reducing false alarms.
In a nutshell, TLSH saves enterprises from having to compare new hashes of an updated version of certain software with other hashes of good files that can be executed in the system. To further demonstrate: in one of our research projects, we were able to streamline querying an application/software with over 50 versions and tens of thousands of unique hashes into a hash value generated by TLSH, which we then confirmed to be a good file that can be safely run on the system by analyzing its digital certificate.
Here, we detail how TLSH works, how it stacks up, and how it can help further secure an enterprise’s perimeter:
Trend Micro Locality Sensitive Hashing (TLSH): An Overview
TLSH helps detect and inspect files that are on, or introduced to, computers. It can be utilized to ensure that only legitimate applications or documents can be run, opened, or saved on the system. TLSH provides a mechanism for promptly comparing the similarity digests of unknown files with a searchable repository of similarity digests of known, legitimate, and allowable files. The idea behind similarity digests is to enable a reliable measure of correlation in terms of identifying similar and unique features on each version of a particular file.
This functionality is organized into these layers:
Applications can be implemented using TLSH as a local service (Figure 1) running on a computer that does its own similarity digest searches and comparisons. It can also be a web service (Figure 2) that handles the search and comparison operations on behalf of the computer system.
When run as a local service, the system receives a set of similarity digests and corresponding file identification of a set of known, legitimate files. The similarity digest search mechanism then builds an index. The similarity digest application will calculate the digests for incoming unknown files. The application then uses the index to perform an approximate nearest neighbor search of the data repository.
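The local-service flow described above (load known digests, build an index, look up the nearest known digest for an unknown file) can be sketched roughly as follows. This is an illustration under stated assumptions, not TLSH itself: digests are modeled as plain integers, the distance is a simple Hamming distance standing in for the TLSH distance metric, the linear scan stands in for a real approximate nearest neighbor index, and the file IDs and threshold are hypothetical.

```python
from typing import Dict, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits; a stand-in for a real digest distance."""
    return bin(a ^ b).count("1")


class DigestIndex:
    """Toy 'index': a dictionary of file ID -> digest, scanned linearly."""

    def __init__(self, known: Dict[str, int]) -> None:
        self.known = dict(known)

    def nearest(self, digest: int) -> Tuple[str, int]:
        """Return (file_id, distance) of the closest known digest."""
        return min(
            ((fid, hamming(digest, d)) for fid, d in self.known.items()),
            key=lambda pair: pair[1],
        )


def classify(index: DigestIndex, digest: int, threshold: int = 4) -> Tuple[str, str, int]:
    """Label an unknown digest as 'exact', 'similar', or 'unknown'."""
    file_id, distance = index.nearest(digest)
    if distance == 0:
        return ("exact", file_id, distance)
    if distance <= threshold:
        return ("similar", file_id, distance)
    return ("unknown", file_id, distance)


# Hypothetical digests of known, legitimate files.
index = DigestIndex({"app-1.0": 0b1111000011110000, "tool-2.3": 0b0000111100001111})
```

A file labeled "similar" would then go through the additional checks described elsewhere in this post (for example, certificate metadata) before being allowed.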
TLSH running as a web service is slightly different. Known, legitimate files are first added to a basic file system store on a web server. The similarity digest search mechanism calculates the digests for all the files in the store, and creates the index. The system calculates the digests for incoming unknown files using the same similarity digest computation layer that runs on the web server. The similarity digest of each unknown file is then submitted to the web server where the web service application on the server uses the similarity digest search mechanism to search and compare each unknown digest to digests of known, legitimate files. The file IDs corresponding to the similarity digests of known, legitimate files are then returned to the computer, which then decides what action to take for these files.
There are two optional stages that can be considered before the system takes an action. After the file IDs are returned to the computer, the system can query the file store with similar file IDs. Details of similar files corresponding to each file ID are then returned to the computer, which then determines what to do to the files.
How does TLSH Measure Up?
Legitimate computer programs constantly change—from a series of small updates to significant modifications, including patches, functionality enhancements, and corruption to their files. Any of these alterations can change the program’s hash value. This poses a challenge for most traditional whitelisting applications that compare computer programs with cryptographic hashes, such as SHA256 or MD5, with a database of known good hashes. A threat actor, for instance, can deploy malware by surreptitiously inserting malicious code in a file or program with a hash value that’s similar to its legitimate counterpart.
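To illustrate why exact-match whitelists are brittle, the short sketch below hashes two byte strings that differ in a single character. The byte strings and the helper name `is_whitelisted` are made up for the example; only Python's standard `hashlib` is used.

```python
import hashlib

# Two hypothetical program images that differ by one byte.
original = b"example program, version 1.0"
patched = b"example program, version 1.1"

digest_original = hashlib.sha256(original).hexdigest()
digest_patched = hashlib.sha256(patched).hexdigest()

# A traditional whitelist stores only the exact hashes of known good files.
KNOWN_GOOD_HASHES = {digest_original}


def is_whitelisted(data: bytes) -> bool:
    """Exact-match lookup: any change to the file yields a different hash."""
    return hashlib.sha256(data).hexdigest() in KNOWN_GOOD_HASHES


print(digest_original)
print(digest_patched)
print(is_whitelisted(original), is_whitelisted(patched))
```

The two SHA256 values share no useful structure, so the whitelist cannot tell a minor update from an unrelated file.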
TLSH addresses this by comparing the program's similarity digest against a data store of similarity digests of known files. An exact match indicates a good file. If the program is only relatively similar to a known, legitimate program, additional tests are run to verify that the new file came from the same trusted source as the original file.
Using similarity digests for whitelisting applications already has traction, thanks to tools like SDHASH and SSDEEP. Both can be used to compute and compare sets of data for similarity of hash values. However, SDHASH and SSDEEP require a very slow, linear search. Based on our tests and analyses, SSDEEP regularly missed detections, and consequently failed to identify updates made to computer programs. SDHASH matches files that have 64-byte sequences in common, allowing it to match computer programs that have areas of significant overlap (such as the use of shared libraries) but also substantially different sections.
Additionally, SSDEEP, SDHASH, and other similarity digests don’t have fast search mechanisms; incorporating them in a whitelisting system can be impractical, as it requires comparisons with up to tens of millions of legitimate computer programs. The use of an index and TLSH (with proper distance metric) allowed us to get speeds typical in regular database lookups.
Trend Micro Locality Sensitive Hashing has been demonstrated in Black Hat Asia 2017 as “Smart Whitelisting Using Locality Sensitive Hashing”, on March 30 and 31, in Marina Bay Sands, Singapore. It has also been published in peer-reviewed papers as “TLSH — A Locality Sensitive Hash” and “Using Randomization to Attack Similarity Digests”.
Posted: 30 Mar 2017 | 3:12 am
As numerous studies have shown, smart houses, smart cars, and smart cities are undeniably beneficial to people in everyday life, but quite often can become a threat to their safety. It is not only a matter of personal data leakage. Just imagine that, for example, a smart refrigerator, affected by a third party at one point or another, would begin identifying expired products as fresh. There is yet another more dismal scenario: the system of a smart car turns the vehicle to the right at high speed, catching the driver unaware…
However, both existing and predictable threats that emerge from home IoT devices are only part of the problem related to the infrastructure around us becoming “smarter”. A technological boom in medicine has both encouraged medical institutions to rely exclusively on information systems for processing data and led to the emergence of new types of technological equipment and personal devices that can interact with traditional systems and networks. This means that the threats that are relevant for those systems and networks can also be relevant for medical systems.
For the medical industry, the main attack vector is related to personal data and information on the health condition of patients. The first step in evaluating the security level for data is identifying entry points within the infrastructure of medical institutions where healthcare data can be collected, stored, and/or taken advantage of by an evildoer.
Possible entry points can be classified as follows:
For the last three classes mentioned above, a detailed first-hand analysis of specific models related to these classes is required. It is for exactly this reason that those devices deserve an article of their own. For now, we will focus on devices and their components that do not require physical access and are frequently accessible from the Internet.
We’ve already written the following about the security of portable devices in March of 2015: “Just imagine, if a fitness tracker with a heart-rate monitor is hacked, then any shop owner will be able to track the heart rate of buyers as they look at discounts in the shop. The influence of advertisements on people can be learned in the same manner. Moreover, a hacked fitness tracker with a heart-rate monitor can be used as a lie detector.”
Owing to the increasing accuracy of sensors, gadgets that collect data on the health condition of their owners can potentially be used in serious ambulatory care to assess a patient’s health. However, the level of security for these gadgets has not been developing as fast as their capabilities.
Tracking vital signs with the help of mobile devices may become an integral part of ambulatory care in the near future
Information that is collected by tracking vital signs can be used by both the owner of the device and the vendor of the infrastructure that the tracking app operates on. For users, the heart-rate parameter can signify that a certain activity should be decreased, specific medicines should be taken, etc., while vendors can send collected data to medical companies that can use it to assess the overall health of the client.
Thus, the main advantage of data collected by a gadget is not the depth of its analysis (any medical examination will yield more accurate results than readings from a fitness tracker) but the ability to evaluate changes in a patient’s health condition dynamically. Scenarios for using the information are limited by the imagination and enterprise of the owner, as well as by laws related to personal data.
If we look at the same piece of information from the perspective of a cybercriminal, the outlook for the owner of such a device is not favorable: analysis of certain parameters (for example, heart rate, sleep quality, or average ADL score) allows a criminal to gain an overview of a victim’s health. Additional information may be provided by a gadget that is connected to the mobile device and is capable, for instance, of measuring the blood pressure or blood sugar levels of its user. After drawing conclusions about the ailments of a victim, an evildoer can provoke their aggravation.
Attacks to obtain health data can be divided into three basic types: those that violate data privacy, those that compromise data integrity, and those that attack data availability. Main vectors can be defined for each of those.
Types of attack that violate the privacy of medical data:
Types of attacks on data integrity:
Attacks on availability:
Entry points for malicious code that commits theft or substitutes data on a mobile device depend on a specific combination of device and software.
Yet, I would like to review another entry point in detail – information systems on a medical institution’s network that are accessible from the Internet.
Medical institutions utilize automated healthcare data storage solutions, which store miscellaneous information about patients (diagnosis results, information about prescribed drugs, medical histories, etc.). The infrastructure of such a system may include various hardware and software components, which can be merged into data storage networks and can be accessible from the Internet in one form or another.
Regarding solutions for storage of healthcare data, several software packages, which can be exploited as entry points into medical infrastructure, can be given as examples.
A key feature of the above-mentioned systems is a web interface (a web app) that is used to control them over the Internet. A web interface may have vulnerabilities that can be exploited by an evildoer, who can gain access to valuable information and processes. It is worth reviewing these systems in detail and verifying whether they are accessible from the Internet, i.e. if they are a potential entry point for evildoers.
In order to evaluate the number of apps that are available from the outside (from the Internet) and can work with EHR, a list of software employed in these tasks should be created and then a dork list should be organized. Dorks are special search-engine queries that are aimed at finding web components of required software among all of the resources indexed by a search engine.
Here is an example of a dork query that uses Google to search for the login form of EHR software components:
intitle:"<vendor_name> Login" & inurl:<vendor_name>
An example of a discovered web component (a login form) of software that is intended to work with EHR
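Organizing a dork list from a software list, as described above, amounts to filling the query template once per product. The sketch below assumes the same template as the example query; the vendor names are hypothetical placeholders, and the script only builds query strings (it does not contact any search engine).

```python
from typing import Iterable, List


def login_dork(vendor: str) -> str:
    """Fill the login-form dork template for one vendor name."""
    return f'intitle:"{vendor} Login" & inurl:{vendor}'


def build_dork_list(vendors: Iterable[str]) -> List[str]:
    """Build one dork per vendor, skipping blanks and duplicates."""
    seen = set()
    dorks = []
    for vendor in vendors:
        name = vendor.strip()
        if name and name not in seen:
            seen.add(name)
            dorks.append(login_dork(name))
    return dorks


# Hypothetical vendor names, for illustration only.
dorks = build_dork_list(["ExampleEHR", "SampleMed", "ExampleEHR"])
for dork in dorks:
    print(dork)
```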
It should be noted that some of the resources found in the search results turned out to be traps for evildoers (honeypots). This fact alone indicates that analysts are seeking to track threats related to medical infrastructure. To check if an identified resource is a honeypot, an IP address should be submitted to a special service, HoneyScore, which, by scanning a number of the resource’s attributes (for example, the hosting provider), reaches a verdict on whether or not the resource is a honeypot. Nevertheless, a significant part of the discovered resources is represented by actual systems.
126 discovered resources that meet the search criteria
Each of the discovered web resources is a potential entry point that can be exploited by an evildoer to access the infrastructure. For example, many discovered systems lack protection against an exhaustive password search, which means that a criminal can use brute-force attacks. Then, by using a hacked account, the evildoer can gain privileged access to the system through the interface or find or exploit online vulnerabilities in order to access the system in the future.
An example of a discovered web interface for logging into an EHR system
A “hospital information system” is quite a vast notion that includes a set of methods and technologies for processing medical information. In our case, we are interested only in the HIS components that have a web interface for controlling and visualizing medical information.
Let’s consider the software of OpenEMR as an example. This software is used in medical institutions as a medical-data management solution, and it is certified by the Office of the National Coordinator for Health Information Technology (ONC). Some of its components are written in the PHP programming language, which means that a potential entry point for an evildoer can be a web server that maintains these OpenEMR components.
The next Google dork query returned 106 search results that meet the following criterion:
inurl:"/interface/login/login_frame.php" intitle:"Login" intext:"Username:"
After a quick analysis of the search results, it became obvious that components of the majority of the discovered OpenEMR systems have vulnerabilities, including some critical ones. These vulnerabilities open up the OpenEMR database to being compromised. Moreover, exploits for the discovered vulnerabilities are publicly available.
An example of a vulnerable HIS that was openly exposed
For example, analyzing different software versions revealed that information had been published on the vulnerabilities for the vast majority of software installed on the hosts.
| OpenEMR version | Number of hosts (%) | Availability of public exploits |
| Proprietary (modified) version | 8.5 | – |
There are at least two types of NAS servers that have been used by medical institutions: dedicated “medical” NAS servers and common ones. While the former have strict security requirements for the data stored on them (for example, compliance with the Health Insurance Portability and Accountability Act), the security of the latter rests on the conscience of their developers and the medical institutions that use this type of NAS in their infrastructure. As a result, non-medical NAS may be left working without any updates for years and thus gather a great number of known vulnerabilities.
A list of dorks should be created to select NAS devices located in medical institutions out of all of the other devices indexed by search engines.
The next query is for the Censys search engine, which specializes in indexing devices with IP addresses and finds all of the devices (workstations, servers, routers, NAS servers, etc.) that belong to companies whose names contain words that directly or indirectly define these companies as medical institutions (“healthcare”, “clinic”, “hospital”, and “medical”):
autonomous_system.organization: (hospital or clinic or medical or healthcare)
The Censys search engine found approximately 21,278 hosts that are related to medical institutions
The Censys report, which is shown below, lists the top 10 countries where these hosts are located.
| United States | 18,926 |
| Republic of Korea | 135 |
Afterward, the search results can be narrowed down to only those hosts that are FTP servers. To do this, the search-engine query should be more specific: for example, it can search only for hosts that have an open FTP port and whose banners contain the “FTP” line (the banner is the information that a server sends to a client during attempts to connect to its port):
(tags: ftp) and autonomous_system.organization: (health or clinic or medical or healthcare)
The search results displayed 1,094 hosts with operational FTP servers, which presumably belong to medical institutions.
Additionally, a list of vendor-specific NAS devices can be obtained from the narrowed-down search results. For this, the typical characteristics of a device must be known. These may be included in responses from services that are active on the device (for example, an FTP-server response to a connection attempt may contain the name of the device and its firmware version). The next query allows for selection of only those hosts that contain the “NAS” line in their banner (generally, several QNAP Systems models have this property) from all found hosts:
(metadata.description: nas) and autonomous_system.organization: (health or clinic or medical or healthcare)
The discovered QNAP Systems NAS servers that belong to medical organizations
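The queries above share one organization filter and differ only in the extra condition placed in front of it. A small sketch of how such query strings might be assembled is shown below; the helper names are made up, the syntax simply mirrors the queries quoted in this article, and nothing here calls the Censys service.

```python
from typing import Sequence

# Keywords used in the first query of this article.
MEDICAL_KEYWORDS = ("hospital", "clinic", "medical", "healthcare")


def org_filter(keywords: Sequence[str]) -> str:
    """Match hosts whose autonomous-system organization contains any keyword."""
    return f"autonomous_system.organization: ({' or '.join(keywords)})"


def narrowed_query(condition: str, keywords: Sequence[str]) -> str:
    """Prefix the organization filter with an extra condition."""
    return f"({condition}) and {org_filter(keywords)}"


# The keyword list used in the FTP and NAS queries quoted in the text.
NARROW_KEYWORDS = ("health", "clinic", "medical", "healthcare")

all_hosts = org_filter(MEDICAL_KEYWORDS)
ftp_hosts = narrowed_query("tags: ftp", NARROW_KEYWORDS)
nas_hosts = narrowed_query("metadata.description: nas", NARROW_KEYWORDS)
```

Keeping the keyword list in one place makes it easier to see which organization names each result set actually covers.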
A vulnerable release of the ProFTPd FTP server was installed on each of the NAS devices found. Information about exploits for this release is also publicly available and easily accessible.
The most common type of devices that utilize the DICOM format are PACS servers that print patient images that have been received from other DICOM devices.
It is possible to enter the following primitive query in the Shodan search engine to start searching for DICOM devices:
Accordingly, the search results will display hosts (mostly workstations and servers) that are used in medical institutions for storing and processing patient DICOM images.
The list of hosts that are used to process/store DICOM images
Also, it might be worth searching for diagnostic DICOM workstations, which are dedicated PACS systems used for processing, diagnosing, and visualizing data. As an example, the following query for the Censys search engine can be used:
pacs and autonomous_system.organization: (hospital or clinic or medical or healthcare)
Analysis of the search results may reveal dedicated software for a diagnostic workstation.
The login forms of diagnostic workstations used for visualization of patient data
Aside from that, there are also admin panels used to access DICOM servers in the search results.
A login form for accessing a DICOM server
The systems described above handle valuable medical data. Therefore, security requirements for those systems must be high. However, let’s not forget that besides potential entry points, there are dozens of other points an evildoer can use that are not directly related to medical systems but are located in the infrastructure along with valuable data.
Here are several examples of non-medical systems that can be used as a potential entry point into a computer network with the goal of subsequently moving on to resources where medical information is stored:
Each of the mentioned systems may have a vulnerability that can be taken advantage of by an evildoer in order to gain access to medical infrastructure.
For example, the prevalence of the Heartbleed vulnerability can be evaluated. This requires entering the following query into the Censys search engine:
autonomous_system.organization: (hospital or clinic or medical or healthcare) and 443.https.heartbleed.heartbleed_vulnerable: 1
The search engine showed 66 hosts that met the criteria and were potentially vulnerable to Heartbleed. This was after the existence of the vulnerability and its dangers had been given wide coverage by the mass media. Generally speaking, when referring to Heartbleed, it should be noted that the problem is global in nature. According to a report by the founder of Shodan, approximately 200,000 websites still remain vulnerable.
In order to keep evildoers from stealing medical data from institutions, we recommend doing the following, along with taking the essential security measures typical for enterprise infrastructure:
Posted: 30 Mar 2017 | 2:31 am
Posted: 19 Mar 2017 | 9:28 pm
It all started with a malicious RTF document attached to an email and a request from reader Chris (thanks for your request and help!) to locate the embedded SWF object since it was believed to contain a hidden PE file.
The RTF document contained a 2012 exploit which is described here. The difference between the two documents is that this one contained a SWF file.
I proceeded to use oletools to search for SWF files using pyxswf.py. Nothing. I then used rtfobj.py to dump all the objects.
I looked through the files and no SWF header. I also used OfficeMalScanner’s rtfscan and got the exact same objects and no SWF. I went back to each of the objects using a hex editor and I find the header…kind of.
The “FWS” header can be translated into hex as 0x465753 but in the file it shows up as “0x04657532”. It’s off by half-a-byte. I wrote a quick program that shifts the file by converting everything to hex, removing the first hex character at the beginning, padding the end with a null, then converting everything back into bytes.
Now I get the Flash file.
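The original program isn't shown here, so the following is only a reconstruction of the steps as described: convert to hex, drop the first hex character, pad the end with a zero nibble, and convert back to bytes. The function name is made up.

```python
def shift_half_byte(data: bytes) -> bytes:
    """Shift a byte string left by one nibble (half a byte)."""
    hex_text = data.hex()
    # Drop the leading hex character and pad with "0" so the length stays even.
    shifted = hex_text[1:] + "0"
    return bytes.fromhex(shifted)


# The misaligned header bytes seen in the dumped object.
misaligned = bytes.fromhex("04657532")
aligned = shift_half_byte(misaligned)
print(aligned.hex())
```

After the shift, the data begins with the bytes 46 57 53, which is the "FWS" signature.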
Chris suggested I use Didier Stevens’s rtfdump.py (with the latest fix, version 0.5), which gets the job done.
Using JPEXS you can see the deobfuscation routine of the embedded binary data.
This embedded SWF file is reusing an exploit from Magnitude EK. You can read about that exploit here btw. No sign of a PE file.
Let me go back to the large object I dumped earlier and try to find the PE file using static analysis. At the top, I notice that this is the same marker as the one identified in the SecureList blog post. Looks like the PE file is here but it’s obfuscated.
I compare this file with the malware that gets dropped by the malicious RTF file and I can see that it lines up exactly and that the null bytes are left intact. Since nulls are present, I can rule out compression and modern encryption. That leaves shift and XOR as a possibility but since they ignored nulls, I can’t easily get the key.
What I need to find is a large contiguous blob without any nulls. Near the bottom of the PE file I come across padding strings. There’s other parts of the file I could use but this makes it quicker.
All I need to do is XOR the plaintext with the obfuscated portion and I can get the key. I use Converter’s Key Search/Convert and paste the values in and I get the result.
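The same known-plaintext step can be done in a few lines of Python instead of Converter. This is a sketch with made-up sample data: the plaintext below is an example padding string and the key bytes are arbitrary, not the values from the actual sample.

```python
def recover_keystream(plaintext: bytes, obfuscated: bytes) -> bytes:
    """XOR known plaintext with the obfuscated bytes to expose the key bytes."""
    return bytes(p ^ c for p, c in zip(plaintext, obfuscated))


# Made-up example: a padding string with no nulls and an arbitrary 4-byte key.
known_plaintext = b"PADDINGXXPADDING"
example_key = bytes([0x11, 0x22, 0x33, 0x44])
obfuscated = bytes(
    b ^ example_key[i % len(example_key)] for i, b in enumerate(known_plaintext)
)

keystream = recover_keystream(known_plaintext, obfuscated)
print(keystream.hex())
```

Because XOR is its own inverse, plaintext XOR ciphertext yields the key bytes in the order they were applied, which is why a long null-free region is so useful.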
Here’s what the result looks like. It looks random but I’m hoping there’s a repeating pattern here. That pattern would represent the XOR key.
Here’s a trick I do to find a pattern. I simply change the dimensions of the Notepad window and watch the pattern emerge. There it is!
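Resizing the Notepad window is a visual way of guessing the period. A programmatic equivalent, sketched below with a made-up helper name and toy data, is to look for the smallest shift at which the recovered key bytes line up with themselves.

```python
def smallest_period(stream: bytes) -> int:
    """Return the smallest p such that stream[i] == stream[i + p] for all i."""
    for period in range(1, len(stream)):
        if all(stream[i] == stream[i + period] for i in range(len(stream) - period)):
            return period
    # No repetition found within the available data.
    return len(stream)


# Toy example: a 3-byte pattern repeated several times.
toy_stream = b"\x10\x20\x30" * 5
print(smallest_period(toy_stream))
```

The stream needs to be at least a bit longer than the key for the repetition to show up at all.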
So the repeating value appears to be 256 bytes long and looks like this:
Let me test this out with Converter. Ugh, so close!
After analyzing the results, I noticed that if a null character is present then it doesn’t rotate the key for the next loop. So I add a new option to Converter…and it works! It works on the decoy document embedded in this object file as well.
Now I wanted to find the shellcode and verify this using dynamic analysis.
One way is to open the document with Word, dump the memory, then look for that marker. Here’s the shellcode.
And here it is in the RTF document.
If you disassemble this, you find that the first part of the shellcode deobfuscates the second part. XOR’ing the second part with a value of 0xA6 reveals the PE decoding routine. I went ahead and XOR’d it then put everything together in IDA. But let me use a debugger instead…
Ah, it’s not a 256-byte XOR key! You can see that the shellcode deobfuscates the PE file using XOR with a starting value of 0xA8 then incrementing it by 0x07. If there’s a null byte then it skips that byte (and doesn’t increment the value). How simple.
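Here is a rough Python rendering of that routine, based only on the description above rather than on the disassembly: start at 0xA8, add 0x07 after each byte that gets XOR'd, and pass null bytes through without touching the key. The function names are made up.

```python
from typing import List


def rolling_xor(data: bytes, start: int = 0xA8, step: int = 0x07) -> bytes:
    """XOR each non-null byte with a key that advances by `step` (mod 256)."""
    key = start
    out = bytearray()
    for byte in data:
        if byte == 0:
            # Nulls are left intact and do not advance the key.
            out.append(0)
            continue
        out.append(byte ^ key)
        key = (key + step) & 0xFF
    return bytes(out)


def key_sequence(count: int, start: int = 0xA8, step: int = 0x07) -> List[int]:
    """The first `count` key values, ignoring the null-skipping rule."""
    return [(start + step * i) & 0xFF for i in range(count)]


# The sequence only repeats after 256 values because 7 and 256 share no factor.
print(len(set(key_sequence(256))))
```

One caveat of this sketch: a byte that happens to equal the current key value XORs to null, so the round trip is not guaranteed for every possible input.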
So at the end of all this, it turns out that the 256-byte XOR key found during static analysis is the same result I got dynamically albeit the long way to the solution. Very amusing!
Note: If you look at the XOR key above, you’ll see that 0xAF + 7 = 0xB6 + 7 = 0xBD …etc. And when you get to the end, 0xA8 + 7 = 0xAF.
Update 03/01/2017 – To those who were asking, here’s the PE results from VirusTotal.
Posted: 27 Feb 2017 | 6:05 pm
The optimistic outlook is that the internet of things will be an enabling technology that will help make the people and physical systems of the world — health care, food production, transportation, energy consumption — smarter and more efficient.
The pessimistic outlook? Hackers will have something else to hack. And consumers accustomed to adding security tools to their computers and phones should expect to adopt similar precautions with internet-connected home appliances.
“If we want to put networked technologies into more and more things, we also have to find a way to make them safer,” said Michael Walker, a program manager and computer security expert at the Pentagon’s advanced research arm. “It’s a challenge for civilization.”
To help address that challenge, Mr. Walker and the Defense Advanced Research Projects Agency, or Darpa, created a contest with millions of dollars in prize money, called the Cyber Grand Challenge. To win, contestants would have to create automated digital defense systems that could identify and fix software vulnerabilities on their own — essentially smart software robots as sentinels for digital security.
A reminder of the need for stepped-up security came a few weeks after the Darpa-sponsored competition, which was held in August. Researchers for Level 3 Communications, a telecommunications company, said they had detected several strains of malware that launched attacks on websites from compromised internet-of-things devices.
The post Stepping up security for an Internet-of-Things World appeared first on CyberESI.
Posted: 18 Oct 2016 | 7:42 am