The web is a big place — and thanks to the dynamic nature of
Web 2.0 applications and user-contributed content it grows bigger by the minute.
According to Netcraft's latest survey, there are over 127 million active
websites.
While most malware in the past was distributed via email,
more recently the web has been the primary attack vector used by malware
authors to distribute malicious code. The mushrooming of the web and the
browser's status as a critical tool have made the browser a prime target for cybercriminals.
In a recent study, Google reported that in an in-depth analysis of 4.5 million websites
over a 12-month period, it discovered that 450,000 sites were successfully launching
drive-by downloads of malware. A University of Washington
study found that of all the sites with downloadable content they examined, six
percent contained malware.
The most widely deployed web gateway products today are URL
filters, which are in use in up to 95 percent of enterprise networks. URL
filtering products were originally designed to increase employee productivity
and limit legal liabilities by enforcing acceptable use policies for the web.
As trojans, keystroke loggers, rootkits and other web
malware have become a major enterprise security issue, URL filtering companies
have sought to remake themselves into security companies, and their URL
filtering products have been repositioned from web productivity solutions to web security
solutions. Products often claim comprehensive protection from web malware
through their sizable URL databases and offer regular updates that are
downloaded daily or more frequently from a central server.
But it has become glaringly apparent that URL filtering,
which, after all, was not designed to stop malware, cannot keep pace with
today's web threat environment.
URL filtering can only be as effective as its database of
categorized web sites. URL filtering solutions rely on visiting each URL or “crawling”
the web in an effort to inventory “bad” sites. In a recent ad, a leading URL filtering vendor
claimed it crawled 80 million websites a day. While this sounds like a big
number, keep in mind that it still leaves some 47 million of Netcraft's 127 million active sites unexamined.
More troubling is that these figures don't take into account
Web 2.0 websites that are powered by third-party and user-contributed content
and as a result are constantly changing. For example, MySpace alone has 100
million accounts, each with several different web pages, and Wikipedia
has more than 7.9 million individual articles. So crawling 80 million
websites per day is really just a drop in the bucket and leaves a lot of the web
uncategorized. More importantly, if you are relying on a URL database, your
users may be exposed to the threats that reside on “good” sites if there is a
gap between when they were last scanned and when the malware is posted on the
site.
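The gap described above can be made concrete with a minimal sketch. It assumes a hypothetical URL database that stores one verdict per site along with the hour of the last crawl; the names `CrawlRecord`, `url_filter_verdict` and `exposure_window` are illustrative and not taken from any vendor's product.

```python
from dataclasses import dataclass


@dataclass
class CrawlRecord:
    category: str    # verdict stored at the last crawl, e.g. "good" or "bad"
    crawled_at: int  # hour at which the crawler last visited the site


def url_filter_verdict(db, url):
    """Return the stored category, or 'uncategorized' if the URL was never crawled."""
    record = db.get(url)
    return record.category if record else "uncategorized"


def exposure_window(record, infected_at, next_crawl_at):
    """Hours during which the database still reports the old verdict
    although the site is already serving malware.

    Assumption: a crawl that happens after the infection would detect it.
    """
    if infected_at <= record.crawled_at:
        return 0  # the last crawl already saw the infected page
    return next_crawl_at - infected_at


# Hypothetical example: crawled at hour 0, infected at hour 3, next crawl at hour 24.
record = CrawlRecord(category="good", crawled_at=0)
db = {"http://example.test/": record}
verdict = url_filter_verdict(db, "http://example.test/")
window = exposure_window(record, infected_at=3, next_crawl_at=24)
```

In this made-up scenario the filter keeps answering "good" for the 21 hours between the infection and the next crawl, and any site missing from the database comes back as "uncategorized".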
According to Gartner, “URL filtering suffers a fundamental
flaw to be an effective security filter: It does not monitor threats in real
time.”
In fact, a third-party test using 200 known spyware samples
revealed that an enterprise URL filtering product significantly underperformed
a signature-based gateway anti-malware scanner, missing 31 percent of the
keystroke loggers, browser hijackers, adware and other malicious code.
The only way to
ensure that users are not infected by malware is to scan all content in
real-time.
Using URL filtering to defend against malware is like
reading yesterday's newspaper to find the current price of your favorite stock.
So why are most companies still relying on URL filtering to deliver
protection from malware?
There was a time when most malware resided on suspect URLs,
like porn or gambling sites. So deploying a URL filter and blocking user access
to dodgy sites might have offered some protection from web-based malware. That
day has come and gone.
Web threats are no longer restricted to dodgy sites. In
today's web security world, threats are just as likely to be found on
reputable, trusted sites.
In fact, over the past year there have been countless incidents
of legitimate sites being found to host malware. MySpace, the Miami Dolphins
website, Wikipedia and the Samsung website have all been contaminated with
malware.
The decentralization of website content has made it easy for
cybercriminals to inject malware onto unsuspecting sites. Malware is being
inserted on web pages via insecure ad servers, compromised hosting networks,
user-contributed content, and even through third-party widgets, commonly found
on many legitimate sites.
In May, the Tom's Hardware website, a popular technical product review site visited by thousands of
tech-savvy users, unknowingly hosted a malware-infected ad which used the
animated cursor (ANI) vulnerability to spread a trojan.
Of even more concern is that cybercriminals are using very
sophisticated tactics to seed malware on all types of sites.
In late June, a fast-flux
network, a disturbing advance in the development and use of bot networks, was
used to spread malware via a flash movie on MySpace. Possibly 100,000 MySpace
accounts were affected by the attack. In effect, this MySpace attack in June
was a double-whammy, combining the insecurities inherent in many Web 2.0 sites
with a powerful, new and incredibly stealthy distribution technique.
Unlike traditional “bot” networks, fast-flux networks abuse DNS to dynamically
resolve an address to any number of infected PCs, and use the same
technique to hide the control servers, which makes them much harder to shut
down. This high-tech game of Whack-a-Mole ensures that the offending site(s)
are active for a much longer period of time.
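To illustrate why fast-flux hosts stand out in DNS data, here is a minimal sketch of a detection heuristic. It assumes you already have the results of repeated lookups of one hostname as (TTL, list of IPs) pairs; the function name and both thresholds are illustrative assumptions, not values from any product or study, and the addresses come from ranges reserved for documentation.

```python
def looks_like_fast_flux(observations, min_distinct_ips=10, max_ttl=300):
    """Flag a hostname whose answers rotate through many IPs with short TTLs.

    observations: list of (ttl_seconds, [ip, ...]) tuples collected from
    repeated lookups of the same hostname. Thresholds are assumptions.
    """
    if not observations:
        return False
    distinct_ips = set()
    for _ttl, ips in observations:
        distinct_ips.update(ips)
    all_short_ttl = all(ttl <= max_ttl for ttl, _ips in observations)
    return all_short_ttl and len(distinct_ips) >= min_distinct_ips


# A conventional site: one address, long TTL, same answer every time.
stable = [(3600, ["192.0.2.10"])] * 4

# A flux-like pattern: every lookup returns three new addresses with a 60-second TTL.
flux = [
    (60, ["198.51.100.%d" % (base + i) for i in range(3)])
    for base in (1, 11, 21, 31)
]

stable_flagged = looks_like_fast_flux(stable)
flux_flagged = looks_like_fast_flux(flux)
```

A real detector would need more signals than this, since content delivery networks also rotate addresses legitimately; the sketch only shows the rotation pattern the paragraph above describes.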
If URL filtering provides good policy enforcement but not
security, what does?
It's simple: Real-time scanning of web traffic is the only
true defense against malware.
There's been a lot of hype surrounding real-time scanning of
web traffic, but what does it mean and what does it need to encompass in order
to be an effective defense against web-based malware?
First and foremost, real-time scanning means that all
content at a URL is scanned in real time every time it is requested. This is an
important distinction from URL filtering, which merely compares
requested URLs against a limited database of known, categorized URLs.
Effective real-time scanning should be powered by a
combination of multiple detection technologies. Used on their own
to combat malware, these techniques can often fall short. However, when they are
combined in a cocktail approach, their strengths are leveraged and their
shortcomings are mitigated.
Signature-based
detection: Signature-based engines are extremely effective at identifying
and blocking known threats. Multiple signature-based engines form an important
part of a multi-layered cocktail approach to real-time scanning.
However, signature-based malware detection only works for
known malware. It is not useful for new threats. Additionally, to be
effective, signatures must be delivered and propagated quickly, a
time-consuming task.
Heuristics: Using
a rule of thumb to detect variants of
known malware is an effective tool in the fight against malware. However, if
your heuristics are too aggressive, you experience false positives. Also,
heuristics are designed to increase the probability of detecting something that
is similar to something that you have seen before. This means that a heuristic
won't detect completely novel malware.
Code Analysis: The
behavior of the code can be determined by modeling program logic, behavioral
rules, and contextual system call analysis techniques that suggest good or bad
intentions.
Code reputation: Unlike
URLs whose content can change, a binary can, in fact, have a reputation based
on historical analysis. “Good” code can be treated differently than unknown or
bad code.
URL Reputation: URL
reputation is derived by examining parameters such as IP address information,
country of the web server, history and age of the URL, domain registration
information, network owner information, URL categorization information, and
types of content present.
URL reputation provides a “credit history” of sorts for a
URL, but it does not provide current information about the safety of a URL. When
looking at web safety, it is useful to remember what you learned in Investment
101: “Past performance does not predict future performance.” As we've seen,
“good” websites today may host malware tomorrow. In the Web 2.0 world there are
few examples of “good” websites that are guaranteed to be good forever.
Using URL reputation alone to defend against malware is like
trying to know if it will rain today by checking to see if it rained yesterday.
Traffic Behavioral Analysis:
Traffic behavior analysis identifies suspicious, atypical traffic that would
suggest, for example, a new phishing scam or perhaps active malware communications
from an infected notebook computer to a command-and-control computer.
Unlike reputation techniques, which are based on past
behavior and provide valuable historical context, actively monitoring web-traffic
patterns and anomalies provides a real-time look into emerging threats.
Behavioral analysis of traffic, however, is only effective
if it is based on a large volume of real world traffic.
Using this cocktail of threat detection techniques in
real-time provides a 360-degree view of the current web threat environment
compared to the limited view you get when relying solely on URL filtering. It's
the difference between seeing the full picture and just seeing one piece of the
puzzle.
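As a rough illustration of the cocktail idea, the sketch below layers three of the techniques described above (signatures, heuristics and URL reputation) over the content of a single response. Everything in it is a simplifying assumption: the signature set, the suspicious markers, the reputation score between 0 and 1 and the thresholds are all made up for the example, and real engines are far more elaborate.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-bad payloads.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"known-bad-sample").hexdigest()}

# Hypothetical heuristic markers often associated with obfuscated script.
SUSPICIOUS_MARKERS = (b"eval(unescape(", b"document.write(unescape(")


def signature_check(content):
    """True if the content exactly matches a known-bad digest."""
    return hashlib.sha256(content).hexdigest() in KNOWN_BAD_SHA256


def heuristic_score(content):
    """Count how many suspicious markers appear in the content."""
    return sum(1 for marker in SUSPICIOUS_MARKERS if marker in content)


def scan(content, url_reputation, heuristic_threshold=1, reputation_floor=0.2):
    """Scan the content on every request, whatever the URL's history.

    url_reputation is an assumed score from 0.0 (bad) to 1.0 (good).
    """
    if signature_check(content):
        return "block: known signature"
    if heuristic_score(content) >= heuristic_threshold:
        return "block: heuristic match"
    if url_reputation < reputation_floor:
        return "block: poor URL reputation"
    return "allow"


known = scan(b"known-bad-sample", url_reputation=0.9)
variant = scan(b"<script>eval(unescape('%41'))</script>", url_reputation=0.9)
shady = scan(b"<p>hello</p>", url_reputation=0.1)
clean = scan(b"<p>hello</p>", url_reputation=0.9)
```

Note that the first two requests are blocked even though the URL has a high reputation score, which is the point of scanning content rather than trusting a site's history.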
The 360-degree view of web threats that real-time scanning
delivers also allows the dots to be connected. Simply crawling the web for
dangerous sites yields a random collection of bad sites that are seemingly
unrelated. However, by relying on multiple techniques, including those described
above, real-time scanning can provide critical information on the source of
malware infection.
- Dan Nadir is vice president of product strategy at ScanSafe.