SEO poisoning, also known as search engine poisoning, is an attack method that involves creating web pages packed with trending keywords to trick search engines into ranking those pages higher in search results. There are different ways to implement SEO poisoning, such as keyword stuffing, hidden text, and cloaking, among others. In addition to manipulating search rankings, SEO poisoning is widely used to redirect users to unwanted applications, phishing pages, exploit kits and malware, porn, advertisements, and so on.
The ThreatLabZ research team has been actively tracking SEO poisoning campaigns; in this blog, we will share some recent examples and an analysis of the techniques used.
“Midterm elections” campaign
Attackers often use holidays and other timely occasions that are likely to generate a lot of search interest. For this analysis, we chose to focus on the upcoming U.S. election. In the following screenshot, there are three SEO poisoned URLs in the Google search result for the keyword “midterm elections.”
Figure 1: SEO poisoned URLs in Google search
After about a month of looking at this “midterm elections” SEO poisoning campaign, we found more than 10,000 compromised websites with more than 15,000 keywords, and we continue to find hundreds of newly compromised sites involved in this activity every day.
Use of multiple redirects
Let’s take a look at some specific URLs generated by the following SEO poisoning campaign:
The Google cache for the above URL is shown below, and you can see that the Google crawler got a junk page loaded up with many uses of the keyword “midterm elections.”
Figure 2: Google crawler loaded with keywords
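To make the idea of a keyword-stuffed page concrete, here is a minimal Python sketch that measures how much of a page's text is taken up by a single keyword phrase. The function name and the sample text are illustrative assumptions; this is not the scoring used by any search engine or by the analysis described here.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Return the fraction of words in `text` that are part of `keyword` occurrences."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    key_words = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not key_words:
        return 0.0
    n = len(key_words)
    # Count every position where the keyword phrase starts.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == key_words)
    return hits * n / len(words)

if __name__ == "__main__":
    # Hypothetical junk text, not taken from a real page.
    sample = "midterm elections news midterm elections results midterm elections"
    print(keyword_density(sample, "midterm elections"))
```

A density far above what ordinary prose produces is only a hint of stuffing; where to set a threshold is a judgment call and is not specified here.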
When we browsed this URL in Chrome, however, we discovered that it may redirect to this page:
Figure 3: SEO poisoning landing page example
We say “may” because the redirected website is different each time.
We also noted that the request goes through a series of redirects before landing on the final page, as shown in Figure 4 below. This is just one of the many measures that cybercriminals use to keep automated crawlers from reaching the landing pages and adding detection for them.
In our example, the user goes through two redirects via the “302 Found” response code before getting to a real page, as shown in Figure 3:
Redirect URL #1 - 5[.]45[.]79[.]15/input/?mark=20180314-landlordpeace.com/0fuq&tpl=9&engkey=how+to+login+to+zscaler
Redirect URL #2 - www[.]hitcpm[.]com/watch?key=027ed88f05536b6c1a41df968c0abb52
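The first redirect URL carries several query parameters. A small Python sketch using the standard library can pull them apart for inspection. The helper names are assumptions, the "http://" scheme is added only so the URL parses, and no request is sent. What each parameter means to the attacker's backend is not confirmed here; the code only shows the raw values.

```python
from urllib.parse import urlsplit, parse_qs

def refang(ioc: str) -> str:
    """Turn a defanged indicator such as 5[.]45[.]79[.]15 back into a parseable string."""
    return ioc.replace("[.]", ".")

def redirect_params(defanged_url: str) -> dict:
    url = refang(defanged_url)
    if "://" not in url:
        url = "http://" + url  # scheme assumed for parsing only; nothing is fetched
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "path": parts.path,
        # parse_qs decodes '+' as a space and returns lists; keep the first value.
        "params": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }

REDIRECT_1 = ("5[.]45[.]79[.]15/input/?mark=20180314-landlordpeace.com/0fuq"
              "&tpl=9&engkey=how+to+login+to+zscaler")

if __name__ == "__main__":
    for key, value in redirect_params(REDIRECT_1)["params"].items():
        print(key, "=", value)
```

The engkey value decodes to a readable search phrase, which suggests, though does not prove, that the redirector is told which keyword brought the visitor in.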
Figure 4: The web page content of the last redirect
The final landing page that the user sees will be different every time; in our case the user was served the following web page:
The multiple redirect model provides a perfect platform for a MaaS (Malware-as-a-Service) infrastructure, as it shields the final landing page from automated security crawlers.
The attackers are leveraging cloaking techniques whereby the end user is served different content depending on the HTTP headers involved in the web request. We noticed three distinct responses in some of the recent campaigns:
The attacker distinguishes between user view and crawler view by inspecting the user-agent HTTP header of the request. If the user-agent string belongs to a well-known web browser, then user view content is served.
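A defender can probe for this kind of user-agent cloaking by requesting the same URL with a browser user-agent and a crawler user-agent and comparing the responses. The sketch below is a simplified illustration: the fetch function is injected so the example runs without network access, and fake_cloaking_server is a hypothetical stand-in that mimics the behavior described above, not the attackers' actual code.

```python
from typing import Callable

BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/70.0 Safari/537.36")
CRAWLER_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def looks_cloaked(url: str, fetch: Callable[[str, str], str]) -> bool:
    """Fetch `url` once per user-agent and report whether the bodies differ."""
    return fetch(url, BROWSER_UA) != fetch(url, CRAWLER_UA)

def fake_cloaking_server(url: str, user_agent: str) -> str:
    """Hypothetical server: well-known browsers get the user view, others the crawler view."""
    if any(token in user_agent for token in ("Chrome", "Firefox", "Safari")):
        return "user view: redirect toward a landing page"
    return "crawler view: keyword-stuffed page"

if __name__ == "__main__":
    print(looks_cloaked("http://example.test/page", fake_cloaking_server))
```

Exact string comparison is crude, since legitimate dynamic pages also vary between requests. A real check would need a fuzzier comparison and a real HTTP client in place of the stub.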
Without the use of cloaking, the content fetched by the search engine crawler (“crawler view”) and the content fetched directly by a user (“direct view”) will be identical. However, the SEO page will have scripts to detect whether it is an actual user loading the content in a web browser, in which case the user will be redirected to the final landing page containing the malicious content.
Here is an example of an SEO campaign where cloaking is not being used:
The crawler view and direct view for this SEO URL return identical content. The SEO page in this case will redirect to a final landing page based on the user’s action, such as mouse movement or rendering of the page in the web browser. The crawler will not see the landing page redirect, as there is usually no user interaction or browser rendering involved.
Below is a view of what happens when a user browses an SEO-poisoned URL that is not leveraging cloaking techniques. The user will see a webpage as well as a busy icon on the browser tab indicating additional background activity. This activity is leading the user to the final landing page in the background as shown in this screen capture from Fiddler (a free web request debugging tool).
Figure 5: An SEO poisoned URL without cloaking leads user to landing page
The attacker is leveraging specially crafted CSS (Cascading Style Sheets) to perform a redirect from the user’s browser. In CSS, the URL property, written as url(), can be used to set the background image. The figure below shows the typical usage of the URL property (taken from w3schools.com).
Figure 6: URL property
But if the URL property is given no argument, as in url() instead of url(“URL”), the browser will load the parent page again. During the second load, however, the referer HTTP header is set to the parent URL itself. This is why there are two requests to the same URL in Fiddler. It is important to note that the malicious content is served on the second request, in which the referer HTTP header is set to the expected URL.
The figure below shows the CSS code snippet used in the SEO page. The line “background-image: url()” will cause the page to reload.
Figure 7: CSS code snippet in the SEO page
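One simple way to hunt for this trick in a saved page is to search its source for url() references with no argument. The Python sketch below is a heuristic under stated assumptions: the function name and sample markup are made up for illustration, and an empty url() can also appear in legitimate pages by mistake, so a match is a lead to investigate rather than a verdict.

```python
import re
from typing import List

# Matches url(), url( ), url("") and url('') in any letter case.
EMPTY_URL = re.compile(r"""url\(\s*(?:""|'')?\s*\)""", re.IGNORECASE)

def find_empty_css_urls(source: str) -> List[str]:
    """Return the lines of `source` that contain an empty CSS url() reference."""
    return [line.strip() for line in source.splitlines() if EMPTY_URL.search(line)]

if __name__ == "__main__":
    # Hypothetical page source, not the snippet from the campaign.
    sample = (
        "div { background-image: url(); }\n"
        'body { background: url("bg.png"); }\n'
        "p { background: URL( '' ) }"
    )
    for line in find_empty_css_urls(sample):
        print(line)
```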
The second request will load the malicious code, as shown in the image below.
Figure 8: Malicious code
SEO URL generation
Let’s take a look at a typical SEO URL structure seen in SEO poisoning campaigns:
SEO URL: sbtechsiteleri[.]com/docs/bmfns7.php?gneo=access-vba-form-load
We can divide this URL into several parts:
The campaign uses different parameters to generate URLs. We have found hundreds of unique parameters; jtjd and wanh are two examples of parameters shown in the screenshot below.
From the search result in the screenshot, we can reasonably guess there are hundreds of millions of SEO URLs generated for these two parameters.
Figure 9: URLs generated
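As an illustration of how the example SEO URL breaks down, the following Python sketch splits it into components with the standard library. The component labels (domain, directory, script, parameter, keywords) are our own assumptions for readability, the "http://" scheme is added only for parsing, and nothing is fetched.

```python
from urllib.parse import urlsplit, parse_qsl

def split_seo_url(defanged_url: str) -> dict:
    """Split a defanged SEO URL into labeled parts (labels are illustrative)."""
    url = defanged_url.replace("[.]", ".")
    if "://" not in url:
        url = "http://" + url  # scheme assumed for parsing only
    parts = urlsplit(url)
    directory, _, script = parts.path.rpartition("/")
    pairs = parse_qsl(parts.query)
    name, value = pairs[0] if pairs else ("", "")
    return {
        "domain": parts.hostname,
        "directory": directory,
        "script": script,
        "parameter": name,              # varies across URLs, e.g. gneo, jtjd, wanh
        "keywords": value.split("-"),   # hyphen-separated search keywords
    }

SEO_URL = "sbtechsiteleri[.]com/docs/bmfns7.php?gneo=access-vba-form-load"

if __name__ == "__main__":
    for label, part in split_seo_url(SEO_URL).items():
        print(label, "=", part)
```

Grouping observed URLs by the parameter name is one possible way to cluster them, though whether the parameter maps to a specific generator or campaign is not established here.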
SEO web page generation
Although we don’t have access to the backend code used to generate the SEO webpages, we can draw some insights into the generation process based on our analysis of several pages involved in this activity:
The Google cache of the webpage www.sbtechsiteleri[.]com/docs/bmfns7.php?gneo=access-vba-form-load:
Figure 10: Example of Google cache
The first sentence, “I am fairly new to Access,” can be found in several URLs. The second sentence, “Programming Microsoft Access with VBA can be a lot easier if you know the keyboard shortcuts for the most common commands and tasks and the” is from this site:
Figure 11: Example of site found
Following that sentence, you can see, “If you want to set the RecordSource of another form, you must ensure the other form is open first,” which is from this website:
Figure 12: Example of sentence found at site
All three of the above examples are for the keyword “access.”
SEO URLs redirect users to different targets. We saw two modes of operation in the pages that we analyzed:
Here are the top web categories to which the final landing page sites belonged:
1. Adult and pornographic websites
2. Internet services sites; in this case, the SEO campaign's purpose is advertising
3. Politics and religion, an example of which is shown below
4. Exploit servers leading to adware/malware payloads
On average, we see over 3,000 new and unique SEO poisoned URLs every day. ThreatLabZ is actively tracking this threat and will continue to ensure coverage for Zscaler customers.
Indicators of Compromise
The list of the redirectors used by this campaign and some IOCs for PHP files and ZIP files can be found here. If you find these PHP or ZIP files on your website, it is likely that your website has been compromised.