Tuesday, January 31, 2012

Blogspot.com is redirecting to Blogspot.in

From now on, Blogger blogs will redirect to a country-level TLD extension. I usually read the Google Webmaster Central blog, the Google blog, the Gmail blog and others to keep up with the latest updates from Google. Today, Jan 31st 2012, I observed that Blogspot.com is automatically redirecting to Blogspot.in. As I live in India, it redirects to ".in"; it might redirect to .co.uk if I lived in the UK.


Here is the official information from Google regarding this change - Blogspot.com is redirecting to a country-specific URL

Points to know regarding this change:

1. The duplicate content issue is the first thing we notice in this case. However, Google states that the rel=canonical tag will be used across all country-level extensions, and its team is trying to minimize any negative impact on search results.

2. Google receives many requests to remove content from blogs, so it would like to manage content removal country by country. Some content may not be acceptable in one country while being fine in others. With this update, content removed under a particular country’s law will only be removed from the relevant ccTLD and will remain available in other countries.

3. Custom domains will not see any effect. Free Blogspot sites will simply redirect to the country-level extension; everything else remains the same.

4. If visitors would like to see the non-country-specific version, here is the format: http://domain.blogspot.com/NCR (NCR stands for "No Country Redirect"). A quick way to observe the behavior is sketched below.
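
Here is a minimal Python sketch of how you could check this yourself: it asks where a Blogspot address redirects, then requests the /ncr version. The blog address is hypothetical, used purely for illustration.

# Minimal sketch: see where a Blogspot URL redirects, then request the
# non-country (/ncr) version. "example.blogspot.com" is a hypothetical address.
import urllib.request
import urllib.error

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise instead of following

opener = urllib.request.build_opener(NoRedirect)
try:
    opener.open("http://example.blogspot.com/")
except urllib.error.HTTPError as e:
    # From India, the Location header should point at the blogspot.in copy.
    print(e.code, e.headers.get("Location"))

# Appending /ncr asks Blogger for the non-country-redirect (.com) version.
print(urllib.request.urlopen("http://example.blogspot.com/ncr").geturl())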

Friday, January 20, 2012

Google Forecloses On Content Farms With “Panda” Algorithm Update


In January, Google promised that it would take action against content farms that were gaining top listings with "shallow" or "low-quality" content. Now the company is delivering, announcing a change to its ranking algorithm designed to take out such material.

New Change Impacts 12% Of US Results
The new algorithm — Google’s "recipe" for how to rank web pages — started going live yesterday, the company told me in an interview today.

Google changes its algorithm on a regular basis, but most changes are so subtle that few notice. This is different. Google says the change impacts 12% (11.8% is the unrounded figure) of its search results in the US, a far higher impact than most of its algorithm changes have. The change only impacts results in the US; it may be rolled out worldwide in the future.

While Google has come under intense pressure in the past month to act against content farms, the company told me that this change has been in the works since last January.

Officially, Not Aimed At Content Farms
Officially, Google isn’t saying the algorithm change is targeting content farms. The company specifically declined to confirm that, when I asked. However, Matt Cutts — who heads Google’s spam fighting team — told me, "I think people will get the idea of the types of sites we’re talking about."

Well, there are two types of sites "people" have been talking about in a way that Google has noticed: "scraper" sites and "content farms." It mentioned both of them in a January 21 blog post:

We’re evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others’ content and sites with low levels of original content. We’ll continue to explore ways to reduce spam, including new ways for users to give more explicit feedback about spammy and low-quality sites.
As “pure webspam” has decreased over time, attention has shifted instead to “content farms,” which are sites with shallow or low-quality content.

Those are the key sections, which I’ll explore more next.

The “Scraper Update”
About a week after Google’s post, Cutts confirmed that an algorithm change targeting “scraper” sites had gone live:

This was a pretty targeted launch: slightly over 2% of queries change in some way, but less than half a percent of search results change enough that someone might really notice. The net effect is that searchers are more likely to see the sites that wrote the original content rather than a site that scraped or copied the original site’s content.

“Scraper” sites are those widely defined as not having original content but instead pulling content in from other sources. Some do this through legitimate means, such as using RSS files with permission. Others may aggregate small amounts of content under fair use guidelines. Some simply “scrape” or copy content from other sites using automated means — hence the “scraper” nickname.

In short, Google said in January that it was going after sites with low levels of original content, and it delivered a week later.

By the way, sometimes Google names big algorithm changes, such as in the case of the Vince update. Often, they get named by WebmasterWorld, where a community of marketers watches such changes closely, as happened with last year’s Mayday Update.

In the case of the scraper update, no one gave it any type of name that stuck. So, I’m naming it myself the “Scraper Update,” to help distinguish it from the “Farmer Update” that Google announced today.

But “Farmer Update” Really Does Target Content Farms
“Farmer Update?” Again, that’s a name I’m giving this change, so there’s a shorthand way to talk about it. Google declined to give it a public name, nor do I see one in the WebmasterWorld thread where people started noticing the algorithm change as it rolled out yesterday, before Google’s official announcement.

Postscript: Internally, Google told me this was called the “Panda” update, but they didn’t want that on the record when I wrote this original story. About a week later, they revealed the internal name in a Wired interview. “Farmer” is used through the rest of this story, though the headline has been changed to “Panda” to help reduce future confusion.

How can I say the Farmer Update targets content farms when Google specifically declined to confirm that? I’m reading between the lines. Google previously had said it was going after them.

Since Google originally named content farms as something it would target, you’ve had some of the companies that get labeled with that term push back that they are no such thing. Most notable has been Demand Media CEO Richard Rosenblatt, who previously told AllThingsD about Google’s planned algorithm changes to target content farms:

It’s not directed at us in any way.
I understand how that could confuse some people, because of that stupid “content farm” label, which we got tagged with. I don’t know who ever invented it, and who tagged us with it, but that’s not us…We keep getting tagged with “content farm”. It’s just insulting to our writers. We don’t want our writers to feel like they’re part of a “content farm.”

I guess it all comes down to what your definition of a “content farm” is. From Google’s earlier blog post, content farms are places with “shallow or low quality content.”

In that regard, Rosenblatt is right that Demand Media properties like eHow are not necessarily content farms, because they do have some deep and high quality content. However, they clearly also have some shallow and low quality content.

That content is what the algorithm change is going after. Google wouldn’t confirm it was targeting content farms, but Cutts did say again it was going after shallow and low quality content. And since content farms do produce plenty of that — along with good quality content — they’re being targeted here. If they have lots of good content, and that good content is responsible for the majority of their traffic and revenues, they’ll be fine. If not, they should be worried.

More About Who’s Impacted
As I wrote earlier, Google says it has been working on these changes since last January. I can personally confirm that several of Google’s search engineers were worrying about what to do about content farms back then, because I was asked about this issue and thoughts on how to tackle it, when I spoke to the company’s search quality team in January 2010. And no, I’m not suggesting I had any great advice to offer — only that people at Google were concerned about it over a year ago.

Since then, external pressure has accelerated. For instance, start-up search engine Blekko blocked sites that were most reported by its users to be spam, which included many sites that fall under the content farm heading. It gained a lot of attention for the move, even if the change didn’t necessarily improve Blekko’s results.

In my view, that helped prompt Google to finally push out a way for Google users to easily block sites they dislike from showing in Google’s results, via a Chrome browser extension for reporting spam.

Cutts, in my interview with him today, made a point to say that none of the data from that tool was used to make changes that are part of the Farmer Update. However, he went on to say that of the top 50 sites that were most reported as spam by users of the tool, 84% of them were impacted by the new ranking changes. He would not confirm or deny if Demand’s eHow site was part of that list.

“These are sites that people want to go down, and they match our intuition,” Cutts said.

In other words, Google crafted a ranking algorithm to tackle the “content farm problem” independently of the new tool, it says — and it feels like the tool is confirming that it’s getting the changes right.

The Content Farm Problem

By the way, my own definition of a content farm that I’ve been working on is like this:

1. Looks to see what the popular searches are in a particular category (news, help topics)
2. Generates content specifically tailored to those searches
3. Usually spends very little time and money, perhaps as little as possible, to generate that content

The problem I think content farms are currently facing is with that last part — not putting in the effort to generate outstanding content.

For example, last night I did a talk at the University Of Utah about search trends and touched on content farm issues. A page from eHow ranked in Google’s top results for a search on “how to get pregnant fast,” a popular search topic. The advice:


The class laughed at the “Enjoyable Sex Is Key” advice as the first tip for getting pregnant fast. Actually, the advice that you shouldn’t get stressed makes sense. But this page is hardly great content on the topic. Instead, it seems to fit the “shallow” category that Google’s algorithm change is targeting. And the page, there last night when I was talking to the class, is now gone.

Perhaps the new “curation layer” that Demand talked about in its earnings call this week will help in cases like these. On that call, Demand also again defended the quality of its content.

Will the changes really improve Google’s results? As I mentioned, Blekko now automatically blocks many content farms, a move that I’ve seen hailed by some. What I haven’t seen is any in-depth look at whether what remains is that much better. When I do spot checks, it’s easy to find plenty of other low quality or completely irrelevant content showing up.

Cutts tells me Google feels the change it is making does improve results according to its own internal testing methods. We’ll see if it plays out that way in the real world.

Why Google Panda Is More A Ranking Factor Than Algorithm Update

With Google Panda Update 2.2 upon us, it’s worth revisiting what exactly Panda is and isn’t. Panda is a new ranking factor. Panda is not an entirely new overall ranking algorithm that’s employed by Google. The difference is important for anyone hit by Panda and hoping to recover from it.

Google’s Ranking Algorithm & Updates
Let’s start with search engine optimization 101. After search engines collect pages from across the web, they need to sort through them in response to the searches that are done. Which pages are the best? To decide this, they employ a ranking algorithm. It’s like a recipe for cooking up the best results.

Like any recipe, the ranking algorithm contains many ingredients. Search engines look at the words that appear on pages and how people link to pages, try to calculate the reputation of websites, and more. Our Periodic Table Of SEO Ranking Factors explains more about this.

Google is constantly tweaking its ranking algorithm, making little changes that might not be noticed by many people. If the algorithm were a real recipe, this might be like adding in a pinch more salt, a bit more sugar or a teaspoon of some new flavoring. The algorithm is mostly the same, despite the little changes.
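
To make the recipe analogy concrete, here is a toy Python sketch of a ranking function as a weighted blend of ingredients. The factors and weights are invented for illustration; Google’s actual algorithm is far more complex and not public.

# Toy ranking "recipe": a weighted blend of ingredient scores.
# Factors and weights are invented; this is not Google's actual algorithm.
WEIGHTS = {"keyword_match": 3.0, "link_popularity": 5.0, "site_reputation": 2.0}

def score(page_factors):
    """Combine per-page factor values (each 0..1) into one ranking score."""
    return sum(WEIGHTS[name] * page_factors.get(name, 0.0) for name in WEIGHTS)

pages = {
    "page-a": {"keyword_match": 0.9, "link_popularity": 0.2, "site_reputation": 0.5},
    "page-b": {"keyword_match": 0.6, "link_popularity": 0.8, "site_reputation": 0.7},
}

# Tweaking one weight (a pinch more salt) can quietly reorder the results.
for url in sorted(pages, key=lambda u: score(pages[u]), reverse=True):
    print(url, round(score(pages[url]), 2))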

From time-to-time, Google does a massive overhaul of its ranking algorithm. These have been known as "updates" over the years. "Florida" was a famous one from 2003; the Vince Update hit in 2009; the Mayday Update happened last year.

Index & Algorithm Updates
Confusingly, the term “updates” also gets used for things that are not actual algorithm updates. Here’s some vintage Matt Cutts on this topic. For example, years ago Google used to do an “index update” every month or so, when it would suddenly dump millions of new pages it had found into its existing collection.

This influx of new content caused ranking changes that could take days to settle down, hence the nickname of the "Google Dance." But the changes were caused by the algorithm sorting through all the new content, not because the algorithm itself had changed.

Of course, as said, sometimes the core ranking algorithm itself is massively altered, almost like tossing out an old recipe and starting from scratch with a new one. These "algorithm updates" can produce massive ranking changes. But Panda, despite the big shifts it has caused, is not an algorithm update.

Instead, Panda — like PageRank — is a value that feeds into the overall Google algorithm. If it helps, consider it as if every site is given a PandaRank score. Those with a low Panda score come through OK; those with a high one get hammered by the beast.
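
Purely as an illustration of that idea (the sites, scores and formula below are invented, not Google’s), a site-level Panda value could feed into page scoring like this:

# Illustration only: a per-site "PandaRank"-style value feeding into the
# overall page score, rather than a whole new ranking algorithm.
SITE_PANDA_SCORE = {"goodsite.example": 0.0, "farmsite.example": 0.8}  # invented

def adjusted_score(base_score, site):
    # Every page on a flagged site carries the same site-wide penalty.
    return base_score * (1.0 - SITE_PANDA_SCORE.get(site, 0.0))

print(adjusted_score(10.0, "goodsite.example"))  # 10.0: comes through OK
print(adjusted_score(10.0, "farmsite.example"))  # 2.0: hammered site-wide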

Calculating Ranking Factors
So where are we now? Google has a ranking algorithm, a recipe that assesses many factors to decide how pages should rank. Google can — and does — change some parts of this ranking algorithm and can see instant (though likely minor) effects by doing so. This is because it already has the values for some factors calculated and stored.

For example, let’s say Google decides to reward pages that have all the words someone has searched for appearing in close proximity to each other. It decides to give them a slightly higher boost than in the past. It can implement this algorithm tweak and see changes happen nearly instantly.

This is because Google has already gathered all the values relating to this particular factor. It has already stored the pages and made note of where each word sits in proximity to other words. Google can turn the metaphorical proximity ranking factor dial up from, say, 5 to 6 effortlessly, because those factor values have already been calculated as part of an ongoing process.
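
A small sketch of that idea, with made-up numbers: the proximity values are already stored, so changing the dial re-ranks pages without recomputing anything.

# Sketch with invented values: factor scores are precomputed at index time,
# so turning the weight "dial" from 5 to 6 takes effect instantly.
stored_proximity = {"page-a": 0.40, "page-b": 0.55}  # computed during indexing

def proximity_boost(page, dial):
    return dial * stored_proximity[page]

for dial in (5.0, 6.0):
    print(dial, {page: proximity_boost(page, dial) for page in stored_proximity})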

Automatic Versus Manual Calculations
Other factors require deeper calculations that aren’t done on an ongoing basis; these are what Google calls “manual” updates. This doesn’t mean that a human being at Google is somehow manually setting the value of these factors. It means that someone decides it’s time to run a specific computer program to update these factors, rather than it just happening all the time.

For example, a few years ago Google rolled out a "Google Bomb" fix. But then, new Google Bombs kept happening! What was up with that? Google explained that there was a special Google Bomb filter that would periodically be run, since it wasn’t needed all the time. When the filter ran, it would detect new Google Bombs and defuse those.

In recipe terms, it would be as if you were using a particular brand of chocolate chips in your cookies but then switched to a different brand. You’re still "inputting" chocolate chips, but these new chips make the cookies taste even better (or so you hope).
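
In code terms, the “manual” pattern might look like the sketch below: the live algorithm reads stored values, and those values only change when someone runs the batch job. The detection heuristic is a made-up placeholder, not Google’s actual Google Bomb filter.

# Hypothetical sketch of a periodically-run filter. The stored penalties only
# change when the batch job runs, so new "bombs" slip through between runs.
googlebomb_penalty = {}  # stored factor values read by the live algorithm

def run_googlebomb_filter(anchor_text_index):
    """Batch job: recompute penalties occasionally, not continuously."""
    for page, anchors in anchor_text_index.items():
        # Placeholder heuristic: many identical inbound anchors look bomb-like.
        most_common = max(set(anchors), key=anchors.count)
        googlebomb_penalty[page] = 1.0 if anchors.count(most_common) > 100 else 0.0

index = {"page-x": ["miserable failure"] * 150, "page-y": ["nice bio", "profile"]}
run_googlebomb_filter(index)
print(googlebomb_penalty)  # {'page-x': 1.0, 'page-y': 0.0}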

NOTE: In an earlier edition of this story, I’d talked about PageRank values being manually updated from time-to-time. Google’s actually said they are constantly being updated. Sorry about any confusion there.

The Panda Ranking Factor
Enter Panda. Rather than being a change to the overall ranking algorithm, Panda is more a new ranking factor that has been added into the algorithm (indeed, on our SEO Periodic Table, this would be element Vt, for Violation: Thin Content).

Panda is a filter that Google has designed to spot what it believes are low-quality pages. Have too many low-quality pages, and Panda effectively flags your entire site. Being Pandified, Pandification — whatever clever name you want to call it — doesn’t mean that your entire site is out of Google. But it does mean that pages within your site carry a penalty designed to help ensure only the better ones make it into Google’s top results.

At our SMX Advanced conference earlier this month, the head of Google’s spam fighting team, Matt Cutts, explained that the Panda filter isn’t running all the time. Right now, it takes too much computing power to run this particular analysis of pages continuously.

Instead, Google runs the filter periodically to calculate the values it needs. Each new run so far has also coincided with changes to the filter, some big, some small, that Google hopes will improve its ability to catch poor quality content. So far, the Panda schedule has been like this:

1. Panda Update 1.0: Feb. 24, 2011
2. Panda Update 2.0: April 11, 2011 (about 7 weeks later)
3. Panda Update 2.1: May 10, 2011 (about 4 weeks later)
4. Panda Update 2.2: June 16, 2011 (about 5 weeks later)

Recovering From Panda
For anyone who was hit by Panda, it’s important to understand that the changes you’ve made won’t have any immediate impact.

For instance, if you started making improvements to your site the day after Panda 1.0 happened, none of those would have registered for getting you back into Google’s good graces until the next time Panda scores were assessed — which wasn’t until around April 11.

With the latest Panda round now live, Google says it’s possible some sites that were hit by past rounds might see improvements, if they themselves have improved.

The latest round also means that some sites previously not hit might now be impacted. If your site was among these, you’ve probably got a 4-6 week wait until any improvements you make might be assessed in the next round.

If you made changes to your site since the last Panda update and didn’t see improvements, that doesn’t necessarily mean you’ve done something wrong. Pure speculation here, but part of the Panda filter might be watching to see if a site’s content quality looks to have improved over time. After enough time, the Panda penalty might be lifted.

Takeaways
In conclusion, some key points to remember:

Google makes small algorithm changes all the time, which can cause sites to fall (and rise) in rankings independently of Panda.

Google may update factors that feed into the overall algorithm, such as PageRank scores, on an irregular basis. Those updates can impact rankings independently of Panda.

So far, Google has confirmed when major Panda factor updates have been released. If you saw a traffic drop during one of these times, there’s a good chance you have a Panda-related problem.

Looking at rankings doesn’t paint an accurate picture of how well your site is performing on Google. Look at the overall traffic that Google has sent you. Losing what you believe to be a key ranking might not mean you’ve lost a huge amount of traffic. Indeed, you might discover that in general, you’re as good as ever with Google.

Wednesday, January 4, 2012

Latent Semantic Indexing

"LSI (Latent Semantic Indexing) is an ethnic way to get higher search engine placement with the use of synonyms rather than keyword density."

Latent Semantic Indexing (LSI) is the process of retrieving relevant words or information from the content of your website. It has been a remarkable topic in information retrieval (IR) systems. Top search engines like Google, Yahoo and Bing work with latent semantic indexing systems.

LSI is a different way of optimizing website content for search engines, using varied and related keywords rather than only exact keywords. The idea is to use the main keywords a few times in a piece, while placing no limit on the use of related words or key phrases in the content. So it is very important to write relevant information using relevant words in your web content. You might say that LSI is a way of creating contextual web content that search engines will index and that is optimized for a strong position in search results. It differs from earlier search engine optimization techniques, in which exact keywords or key phrases were the sole focus of website optimization.

Latent Semantic Indexing is a concept, not a technique. You need to keep this concept in mind while writing web content. You cannot fool search engines by inserting repetitive keywords without any other contextual content. The best SEO copywriting, or LSI copywriting, is to write naturally about your subject rather than using keywords and nothing but keywords. This SEO concept focuses on using a series of relevant words to create SEO-friendly contextual content: words related to the primary keywords or key phrases that the webpage is being optimized for. The concept strictly avoids repeating the same keywords a set number of times in a piece of web content; related words should be used instead to avoid repetition. In short, LSI is not a technique; it is common sense and a natural writing practice for the web.
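
For readers curious about the math behind the name, here is a minimal, self-contained NumPy sketch of the linear algebra underlying LSI. The tiny term-document counts are invented; real systems factor far larger matrices.

# Minimal LSI sketch: factor a toy term-document matrix with SVD, then compare
# documents in the reduced "concept" space, where documents using synonyms
# ("car" vs. "automobile") end up close together. All counts are invented.
import numpy as np

terms = ["car", "automobile", "engine", "flower"]
#              doc0 doc1 doc2
A = np.array([[1.0, 0.0, 0.0],   # car        (only doc0 says "car")
              [0.0, 1.0, 0.0],   # automobile (only doc1 says "automobile")
              [2.0, 2.0, 0.0],   # engine     (shared context term)
              [0.0, 0.0, 2.0]])  # flower     (doc2 is unrelated)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                # keep the top-k latent "concepts"
docs = (np.diag(s[:k]) @ Vt[:k]).T   # document vectors in concept space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(round(cosine(docs[0], docs[1]), 2))  # ~1.0: synonym docs cluster together
print(round(cosine(docs[0], docs[2]), 2))  # ~0.0: unrelated doc stays apart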

Thanks to the LSI SEO concept, optimizers and SEO copywriters are finding huge success in the field of internet marketing.

We at e Trade Services offer LSI SEO services. We have expert professional content writers and search engine optimizers who understand the concept of Latent Semantic Indexing well.