
Search engine optimization

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Projectphp (talk | contribs) at 01:39, 2 November 2006 (To add a be or not to add a be, that is the question ~~~~~). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Search engine optimization (SEO) is a subset of search engine marketing, and deals with improving the number and/or quality of visitors to a web site from "natural" (aka "organic" or "algorithmic" search engine) listings. The term SEO can also refer to "search engine optimizers", an industry of consultants who carry out optimization projects on behalf of clients.

Search engines display different kinds of listings on a search engine results page (SERP), including paid advertising in the form of pay per click advertisements and paid inclusion listings, as well as unpaid organic search results and keyword-specific listings, such as links to news stories and images. SEO is concerned with improving the number and position of a site's listings in the unpaid organic results, i.e. the results generated by an algorithm that ranks the relevance of all the pages in its database for a specific keyword search.

SEO strategies vary widely, according to a specific site's needs. Broadly speaking, SEO may be geared towards increasing the total number of visitors from search engines, the quality of those visitors, or both, with a quality visitor measured by how often a visit on a specific keyword leads to a desired conversion action, such as making a purchase or requesting further information.

Search engine optimization can be offered as a stand-alone service or as part of a larger marketing effort. Because part of SEO involves making changes to the source code of a site, it is often most effective when incorporated into the initial development and design of a site, leading to the use of the term Search Engine Friendly to describe content management systems and shopping carts that can be optimized easily and effectively.

A range of strategies and techniques can be employed in SEO, including changes to a site's code (referred to as "on page factors") and getting links from other sites (referred to as "off page factors"). These techniques fall into two broad categories: techniques that search engines either do not approve of, referred to as spamdexing, or attempt to minimise the impact of; and techniques that search engines either have no issue with or that are simply part of good design.

Some commentators, and even some SEOs, classify these methods, and the practitioners who utilise them, as either "white hat SEO" (methods generally approved by search engines, such as building content and improving site quality), or "black hat SEO" (tricks such as cloaking and spamdexing that Search Engines do not universally approve of).[1] Other SEOs reject the black and white hat dichotomy as an over-simplification.

SEO, as a marketing strategy, can often generate a good return. However, because search engines are not paid for the traffic they send from organic search, because the algorithms used can and do change, and because many problems can hinder a search engine when crawling or ranking a site's pages, there are no guarantees of success, in either the short or long term. Due to this lack of certainty, SEO is often compared to traditional public relations (PR), with PPC advertising closer to traditional advertising.

History

Early search engines

Webmasters and content providers began optimizing sites for search engines in the mid-1990s, as the first search engines were cataloging the early Web. Initially, all a webmaster needed to do was submit a site to the various engines, which would run spiders, programs to "crawl" the site, and store the collected data. The default search behavior was to scan an entire web page for so-called related search words, so a page with many different words matched more searches, and a web page containing a dictionary-type listing would match almost all searches, limited only by unique names. The search engines then sorted the information by topic, and served results based on pages they had crawled. As the number of documents online kept growing, and more webmasters realized the value of organic search listings, some popular search engines began to sort their listings so they could display the most relevant pages first. This was the start of a friction between search engines and webmasters that continues to this day.

At first, search engines were guided by the webmasters themselves. Early versions of search algorithms relied on webmaster-provided information such as category and keyword meta tags, or index files in engines like ALIWEB. Meta tags provided a guide to each page's content. When some webmasters began to abuse meta tags, causing their pages to rank for irrelevant searches, search engines abandoned their consideration of meta tags and instead developed more complex ranking algorithms, taking into account factors that elevated a limited number of words (anti-dictionary) and were more diverse.

A number of attributes within the HTML source of a page were often manipulated by web content providers attempting to rank well in search engines.[2] But by relying so extensively on factors that were still within the webmasters' exclusive control, search engines continued to suffer from abuse and ranking manipulation. In order to provide better results to their users, search engines had to adapt to ensure their SERPs showed the most relevant search results, rather than useless pages stuffed with numerous keywords by unscrupulous webmasters using a bait-and-switch lure to display unrelated web pages. This led to the rise of a new kind of search engine.

Development of more sophisticated ranking algorithms

Google was started by two PhD students at Stanford University, Sergey Brin and Larry Page, who brought a new concept to evaluating web pages. This concept, called PageRank, has been important to the Google algorithm from the start.[3] PageRank relies heavily on incoming links and uses the logic that each link to a page is a vote for that page's value. The more incoming links a page has, the more "worthy" it is. The value of each incoming link itself varies directly with the PageRank of the page it comes from and inversely with the number of outgoing links on that page.
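The voting logic described above can be sketched in a few lines of Python. This is only a rough illustration: the damping factor of 0.85 comes from Brin and Page's original paper, and the three-page link graph is a made-up example, not anything from this article.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank. `links` maps each page to the
    list of pages it links out to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Each outgoing link passes an equal share of this
                # page's rank: the "vote" weighted by 1/outdegree.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Illustrative graph: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

Here "C" ends up with the highest rank, since it receives votes from both "A" and "B", while "B" ranks lowest, receiving only half of "A"'s vote.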

With help from PageRank technology, Google proved to be very good at serving relevant search results. Google quickly became the most popular and successful search engine. Because PageRank measured an off-site factor, Google felt it would be more difficult to manipulate than on-page factors.

However, webmasters had already developed link building tools and schemes to influence the Inktomi search engine. These methods proved to be equally applicable to Google's algorithm. Many sites focused on exchanging, buying, and selling links on a massive scale. PageRank's reliance on the link as a vote of confidence in a page's value was undermined as many webmasters sought to garner links purely to influence Google into sending them more traffic, irrespective of whether the link was useful to human site visitors.

Further complicating the situation, the default search-bracket was still to scan an entire web page for so-called related search-words, and a web page containing a dictionary-type listing would still match almost all searches (except special names) at an even higher priority given by link-rank. Dictionary pages and link schemes could severely skew search results.

It was time for Google — and other search engines — to look at a wider range of off-site factors. There were other reasons to develop more intelligent algorithms. The Internet was reaching a vast population of non-technical users who were often unable to use advanced querying techniques to reach the information they were seeking and the sheer volume and complexity of the indexed data was vastly different from that of the early days. Search engines had to develop predictive, semantic, linguistic and heuristic algorithms. Around the same time as the work that led to Google, IBM had begun work on the Clever Project [4], and Jon Kleinberg was developing the HITS algorithm.

A proxy for the PageRank metric is still displayed in the Google Toolbar, though the displayed value is rounded to an integer and the data is updated infrequently, so it is likely to be outdated. For these reasons, and because PageRank is only one of more than 100 "signals" that Google considers in ranking pages, experienced SEOs recommend ignoring the displayed PageRank.[5]

Today, most search engines keep their methods and ranking algorithms secret, to compete for finding the most valuable search-results and to deter spam pages from clogging those results. A search engine may use hundreds of factors in ranking the listings on its SERPs; the factors themselves and the weight each carries may change continually. Algorithms can differ widely: a web page that ranks #1 in a particular search engine could rank #200 in another search engine.

Google, Yahoo, Microsoft and Ask.com do not disclose the algorithms they use to rank pages. Some SEOs have carried out controlled experiments to gauge the effects of different approaches to search optimization. Based on these experiments, often shared through online forums and blogs, professional SEOs form a consensus on what methods work best.

SEOs widely agree that the top signals that influence a page's rankings include:[6][7]

  1. Keywords in the title tag.
  2. Keywords in links pointing to the page.
  3. Keywords appearing in visible text.
  4. Link popularity (PageRank for Google) of the page.

In addition, there are many other signals that can affect a page's ranking.[8]

The relationship between SEO and the search engines

The first mentions of search engine optimization do not appear on Usenet until 1997, a few years after the launch of the first Internet search engines. The operators of search engines quickly recognized that some people in the webmaster community were making efforts to rank well in their search engines, and even manipulating the page rankings in search results. In some early search engines, such as Infoseek, ranking first was as easy as grabbing the source code of the top-ranked page, placing it on one's own website, and submitting a URL to instantly index and rank that page.

Due to the high value and targeting of search results, there is potential for an adversarial relationship between search engines and SEOs. In 2005, an annual conference named AirWeb[9] was created to discuss bridging the gap and minimizing the sometimes damaging effects of aggressive web content providers.

Some more aggressive site owners and SEOs generate automated sites or employ techniques that eventually get domains banned from the search engines. Many search engine optimization companies, which sell services, employ long-term, low-risk strategies, and most SEO firms that do employ high-risk strategies do so on their own affiliate, lead-generation, or content sites, instead of risking client websites.

Some SEO companies employ aggressive techniques that get their client websites banned from the search results. The Wall Street Journal profiled a company that allegedly used high-risk techniques and failed to disclose those risks to its clients.[10] Wired reported the same company sued a blogger for mentioning that they were banned.[11] Google's Matt Cutts later confirmed that Google did in fact ban Traffic Power and some of its clients.[12]

Some search engines have also reached out to the SEO industry, and are frequent sponsors and guests at SEO conferences and seminars. In fact, with the advent of paid inclusion, some search engines now have a vested interest in the health of the optimization community. All of the main search engines provide information and guidelines to help with site optimization: Google's, Yahoo!'s, MSN's and Ask.com's. Google has a Sitemaps program[13] to help webmasters learn if Google is having any problems indexing their website, and it also provides data on Google traffic to the website. Yahoo! has Site Explorer, which provides a way to submit URLs for free (like MSN and Google), determine how many pages are in the Yahoo! index, and drill down on inlinks to deep pages. Yahoo! has an Ambassador Program[14] and Google has a program for qualifying Google Advertising Professionals.[15]

Getting into search engines' listings

New sites do not need to be "submitted" to search engines to be listed. However, Google and Yahoo! offer submission programs, such as Google Sitemaps, through which an XML feed can be created and submitted. Generally, however, a simple link from an established site will get the search engines to visit the new site and begin to spider its contents. It can take a few days or even weeks from the acquisition of a link from such an established site for all the main search engine spiders to commence visiting and indexing the new site.

Once the search engine finds a new site, it uses a crawler program to retrieve and index the pages on the site. Pages can only be found when linked to with hyperlinks the crawler can read; for instance, some search engines do not read links created by Flash or JavaScript.

Search engine crawlers may look at a number of different factors when crawling a site, and many pages from a site may not be indexed by the search engines until they gain more PageRank or links or traffic. Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled, as well as other importance metrics. Cho et al.[16] described some standards for those decisions as to which pages are visited and sent by a crawler to be included in a search engine's index.

Webmasters can instruct spiders not to index certain files or directories through the standard robots.txt file in the root directory of the domain. Standard practice requires a search engine to check this file upon visiting the domain, though a search engine crawler will keep a cached copy of this file as it visits the pages of a site, and may not update that copy as quickly as a webmaster does. The web developer can use this feature to prevent pages such as shopping carts or other dynamic, user-specific content from appearing in search engine results, as well as keeping spiders from endless loops and other spider traps.
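The robots.txt convention described above can be demonstrated with Python's standard urllib.robotparser module. The rules and URLs below are made-up examples in the spirit of the paragraph (blocking a shopping cart and search results pages); a real crawler would fetch the file from the site's root.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: for all user agents, keep spiders out of
# the shopping cart and dynamic search results.
rules = """User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A well-behaved crawler asks before fetching each URL.
allowed = parser.can_fetch("*", "http://example.com/products/widget")
blocked = parser.can_fetch("*", "http://example.com/cart/checkout")
print(allowed, blocked)  # True False
```

Note that robots.txt is advisory: compliant spiders check it voluntarily, which is why it cannot be relied on to keep content truly private.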

For those search engines that have their own paid submission program (like Yahoo!), it may save some time to pay a nominal fee for submission. Yahoo!'s paid submission program guarantees inclusion in their search results, but does not guarantee specific ranking within the search results.

White hat methods

White hat methods involve following the search engines' guidelines as to what is and what is not acceptable. Their advice is generally to create content for the user, not for the search engines; to make that content easily accessible to their spiders; and not to try to game the system.[17][18][19][20][21] Often, webmasters make critical mistakes when designing or setting up their websites, inadvertently "poisoning" them so that they will not rank well. White hat SEOs attempt to discover and correct such mistakes, including machine-unreadable menus, broken links, temporary redirects, and poor navigation structure.

Because search engines are text-centric, many of the same methods that are useful for web accessibility are also advantageous for SEO. Google has brought the relationship between SEO and accessibility even closer with the release of Google Accessible Web Search which prioritises accessible websites.

Methods are available for optimizing graphical content, including ALT attributes, and adding a text caption. Even Flash animations can be optimized by designing the page to include alternative content in case the visitor cannot read Flash.

Some methods considered proper by the search engines:[22]

  • Using a unique and relevant title to name each page.
  • Editing web pages to replace vague wording with specific terminology relevant to the subject of the page.
  • Providing unique, quality content to address visitor interests.
  • Using an accurate description meta tag to make search listings more informative.
  • Ensuring that all pages are accessible via anchor tag hyperlinks.
  • Allowing search engine spiders to crawl pages without session IDs, cookies, or logging in.
  • Developing "link bait" strategies. High quality websites that offer interesting content or novel features tend to accumulate large numbers of back links.
  • Writing useful, informational articles under a Creative Commons or other open source license, in exchange for attribution to the author by hyperlink.
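Two of the practices above, a unique, relevant title and an informative description meta tag, can be checked mechanically. The sketch below uses only Python's standard html.parser; the sample page and its wording are invented for illustration.

```python
from html.parser import HTMLParser

class HeadInspector(HTMLParser):
    """Collect the <title> text and description meta tag of a page."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            if attrs.get("name") == "description":
                self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Made-up example page following the white hat advice above.
page = """<html><head>
<title>Blue Widgets - Acme Store</title>
<meta name="description" content="Hand-made blue widgets, shipped worldwide.">
</head><body>...</body></html>"""

inspector = HeadInspector()
inspector.feed(page)
print(inspector.title)        # Blue Widgets - Acme Store
print(inspector.description)  # Hand-made blue widgets, shipped worldwide.
```

Running such a check across every page of a site would flag duplicated titles or missing descriptions, the kind of inadvertent mistake white hat SEOs look for.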

Black hat methods

"Black hat" SEO methods attempt to improve rankings in ways that are disapproved of by the search engines, typically because the search engines consider such methods deceptive and unrelated to providing quality content to site visitors. Search engines often penalize sites they discover using black hat methods, by reducing their rankings or eliminating their listings from the SERPs altogether. Such penalties are usually applied automatically by the search engines' algorithms, because the Internet is too large to make manual policing of websites feasible.

Spamdexing is the promotion of irrelevant, chiefly commercial, pages through deceptive techniques and the abuse of the search algorithms. Over time a widespread consensus has developed in the industry as to what are and are not acceptable means of boosting one's search engine placement and resultant traffic.

Spamdexing often gets confused with white hat search engine optimization techniques, which do not involve deceit. Spamming involves getting websites more exposure than they deserve for their keywords, leading to unsatisfactory search results. Optimization involves getting websites the rank they deserve on the most targeted keywords, leading to satisfactory search experiences.

When discovered, search engines may take action against those found to be using unethical SEO methods. In February 2006, Google removed both BMW Germany and Ricoh Germany for use of these practices.[23]

Cloaking is the practice of serving one version of a page to search engine spiders/bots and another version to human visitors.
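A deliberately simplified sketch makes the definition concrete: the server inspects the User-Agent header and returns different content to crawlers than to people. The bot names below are illustrative; this is shown only to clarify the technique, which search engines penalize when caught.

```python
# Illustrative crawler user-agent substrings (an assumption, not an
# exhaustive or authoritative list).
KNOWN_BOTS = ("Googlebot", "Slurp", "msnbot")

def respond(user_agent):
    """Return different page content depending on who is asking --
    the essence of cloaking."""
    if any(bot in user_agent for bot in KNOWN_BOTS):
        return "keyword-stuffed page tuned for rankings"
    return "the page human visitors actually see"

print(respond("Googlebot/2.1"))  # keyword-stuffed page tuned for rankings
print(respond("Mozilla/5.0"))    # the page human visitors actually see
```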

SEO and marketing

There is a considerable body of SEO practitioners who see search engines as just another visitor to a site, and try to make the site as accessible to those visitors as to any other who would come to the pages. They often see the white hat/black hat dichotomy mentioned above as a false dilemma. The focus of their work is not primarily to rank the highest for certain terms in search engines, but rather to help site owners fulfill the business objectives of their sites. Indeed, ranking well for a few terms among the many possibilities does not guarantee more sales. A successful Internet marketing campaign may drive organic search results to pages, but it also may involve the use of paid advertising on search engines and other pages, building high quality web pages to engage and persuade, addressing technical issues that may keep search engines from crawling and indexing those sites, setting up analytics programs to enable site owners to measure their successes, and making sites accessible and usable.

SEOs may work in-house for an organization, or as consultants, and search engine optimization may be only part of their daily functions. Often their knowledge of how search engines function comes from interacting and discussing the topic on forums, through blogs, at popular conferences and seminars, and from experimentation on their own sites. Few college courses cover online marketing from an ecommerce perspective in a way that can keep up with the changes the web sees on a daily basis.

While endeavoring to meet the guidelines posted by search engines can help build a solid foundation for success on the web, such efforts are only a start. Many see search engine marketing as a larger umbrella under which search engine optimization fits, and many who focused primarily on SEO in the past are incorporating more and more marketing ideas into their efforts, including public relations strategy and implementation, online display media buying, web site transition SEO, web trends data analysis, HTML e-mail campaigns, and business blog consulting, making some SEO firms resemble ad agencies.

In 2002, SearchKing filed suit in an Oklahoma court against the search engine Google. SearchKing's claim was that Google's tactics to prevent spamdexing constituted an unfair business practice. This may be compared to lawsuits which email spammers have filed against spam-fighters, as in various cases against MAPS and other DNSBLs. In January 2003, the court granted summary judgment in Google's favor.[24] In March 2006, KinderStart.com, LLC filed a first amended complaint against Google, and also attempted to have potential members of the class of plaintiffs join the class action.[25] The plaintiff's web site was removed from Google's index prior to the lawsuit, and the amount of traffic to the site plummeted.

References

  1. ^ Goodman, Andrew, SearchEngineWatch, Search Engine Showdown: Black Hats vs. White Hats at SES
  2. ^ What is a tall poppy among web pages?, Proceedings of the seventh conference on World Wide Web, Brisbane, Australia, 1998, written by Pringle, G., Allison, L., and Dowe, D.
  3. ^ Brin, Sergey and Page, Larry, The Anatomy of a Large-Scale Hypertextual Web Search Engine, Proceedings of the seventh international conference on World Wide Web, 1998, pp. 107-117
  4. ^ The Clever Project, May 4, 2006
  5. ^ WebmasterWorld.com- search engine optimization forum
  6. ^ Search Engine Watch - Search Engine News and Forums. Organizer of SES (Search Engine Strategies) Conferences.
  7. ^ Highrankings Forum - forum for experienced SEOs and those new to the field
  8. ^ Information Retrieval Based on Historical Data, Google Patent Application, October 10, 2005
  9. ^ AirWeb Adversarial Information Retrieval on the Web, annual conference and workshop for researchers and professionals
  10. ^ Startup Journal (Wall Street Journal), 'Optimize' Rankings At Your Own Risk, by David Kesmodel at The Wall Street Journal Online, September 9, 2005
  11. ^ Wired Magazine, Legal Showdown in Search Fracas, September 8, 2005, written by Adam L. Penenberg
  12. ^ Cutts, Matt, Confirming a penalty, published on 2006-02-02 at the Matt Cutts Blog
  13. ^ Google Web Master Central, formerly known as Google Sitemaps
  14. ^ Ambassador Program by Yahoo! Search Marketing
  15. ^ Google Advertising Professionals, a Program by Google AdWords, Google's Pay-Per-Click Advertising Service
  16. ^ Efficient crawling through URL ordering by Cho, J., Garcia-Molina, H. , 1998, published at "Proceedings of the seventh conference on World Wide Web", Brisbane, Australia
  17. ^ Ask.com Editorial Guidelines
  18. ^ Google's Guidelines on SEOs
  19. ^ Google's Guidelines on Site Design
  20. ^ MSN Search Guidelines for successful indexing
  21. ^ Yahoo! Search Content Quality Guidelines
  22. ^ Whalen, Jill, HighRankings Forum, Ten Tips to the Top of the Search Engines
  23. ^ Ramping up on international webspam by Matt Cutts, published February 4, 2006, at Matt Cutts Blog
  24. ^ [1] Google replies to SearchKing lawsuit, James Grimmelmann at LawMeme (research.yale.edu), January 9, 2006
  25. ^ [2] (PDF) KinderStart.com, LLC, et al v. Google, Inc., C 06-2057 RS, filed March 17, 2006 in the Northern District of California, San Jose Division.


See also

Google Engineers