Semantic Web, SEO and Google Rich Snippets


A selection of relevant articles for anyone who wants to add semantic markup to a website for SEO reasons.
www.readwriteweb.com

How Best Buy is Using The Semantic Web

Sliced by
-
07-07-2010 07-07-2010
-
www.readwriteweb.com Yesterday we wrote about the increasing usage of Semantic Web technologies by large commercial companies like Facebook, Google and Best Buy. The Semantic Web is a Web of ...

Jay Myers of Best Buy told us that the primary goal of using semantic technologies was to increase the visibility of the company's products and services. With data such as store name, address, store hours and geo data marked up using RDFa, search engines are now able to identify each of those data components more easily and put them into context.

A quick refresher on the terminology: just as the lingua franca of the Web is HTML (Hypertext Markup Language), RDF (Resource Description Framework) is commonly thought of as the primary language of the Semantic Web. RDFa is a kind of 'lite' version of RDF, which adds metadata to HTML (or XHTML) webpages.

The process of adding RDFa to Best Buy's webpages began two years ago, when the company began to look for ways to get more visibility to its stores on the Web. "At that time," said Myers, "it was difficult for users to find basic store information like store location and hours."

To solve this dilemma, Best Buy gave each store its own blog.

Best Buy employees entered information into the blogs every day, using online forms that output RDFa. Myers told us that the use of RDFa makes "human input from our store employees more visible on the Web."

Best Buy is using GoodRelations, a Semantic Web vocabulary for e-commerce that describes product, price, and company data.

Myers remarked that "there isn't a noticeable difference" to the users of Best Buy's website; however, all of the RDFa data is very visible to humans via search engine results and the store locator tool. The RDFa data is "also great for machines," said Myers, which has resulted in "a definite uptick in the amount of search traffic to these pages." At last week's SemTech conference, Myers said that it had resulted in a 30% increase in search traffic. He noted that Best Buy hadn't expected to see an SEO benefit, but it has been a boon since the company is "very reliant on search engines" for product discovery and store locations.

With Jay Myers at the development wheel, Best Buy's web presence will continue to be enhanced by the Semantic Web. RDFa can ultimately create rich relationships between products, which will in turn "create a deeper visibility to additional products" when a customer is shopping.

That seems like a distinct competitive advantage for Best Buy.


radar.oreilly.com

Google's Rich Snippets and the Semantic Web - O'Reilly Radar

by Tim O'Reilly | @timoreilly

There's a long-time debate between those who advocate for semantic markup, and those who believe that machine learning will eventually get us to the holy grail of a Semantic Web, one in which computer programs actually understand the meaning of what they see and read. Google has of course been the great proof point of the power of machine learning algorithms.

Earlier this week, Google made a nod to the other side of the debate, introducing a feature that they call "Rich Snippets." Basically, if you mark up pages with certain microformats (and soon, with RDFa), Google will take this data into account, and will provide enhanced snippets in the search results. Supported microformats in the first release include those for people and for reviews.

So, for example, consider the snippet for the Yelp review page on the Slanted Door restaurant in San Francisco:

[Image: slanteddoor.png]

The snippet is enhanced to show the number of reviews and the average star rating, with a snippet actually taken from one of the reviews. By contrast, the Citysearch results for the same restaurant are much less compelling:

[Image: citysearch.jpg]

(Yelp is one of Google's partners in the rollout of Rich Snippets; Google hopes that others will follow their lead in using enhanced markup, enabling this feature.)

Rich snippets could be a turning point for the Semantic Web, since, for the first time, they create a powerful economic motivation for semantic markup. Google has told us that rich snippets significantly enhance click-through rates. That means that anyone who has been doing SEO is now going to have to add microformats and RDFa to their toolkit.

Historically, the biggest block to the Semantic Web has been the lack of a killer app that would drive widespread adoption. There was always a bit of a chicken-and-egg problem, in which users would need to do a lot of work to mark up the data for the benefit of others before getting much of a payoff themselves. But as Dan Bricklin remarked so insightfully in his 2000 paper on Napster, The Cornucopia of the Commons, the most powerful online dynamics are released not by appeals to volunteerism, but by self-interest:

"What we see here is that increasing the value of the database by adding more information is a natural by-product of using the tool for your own benefit. No altruistic sharing motives need be present..."

(Aside: @akumar, this is the answer to your question on Twitter about why in writing up this announcement we didn't make more of Yahoo!'s prior support for microformats in SearchMonkey. You guys did pioneering work, but Google has the market power to actually get people to pay attention.)

What I also find interesting about the announcement is the blurring line between machine learning and semantic markup.

Machine learning isn't just brute force analysis of unstructured data. In fact, while Google is famous as a machine-learning company, their initial breakthrough with PageRank was based on the realization that there was hidden metadata in the link structure of the web that could be used to improve search results. It was precisely their departure from previous brute force methods that gave them some of their initial success. Since then, they have been diligent in developing countless other algorithms based on regular features of the data, and in particular regular associations between data sets that routinely appear together: implied metadata, so to speak.

So, for example, people are associated with addresses, with dates, with companies, with other people, with documents, with pictures and videos. Those associations may be made explicitly, via tags or true structured markup, but given a large enough data set, they can be extracted automatically. Jeff Jonas calls this process "context accumulation." It's the way that our own brains operate: over time, we make associations between parallel data streams, each of which informs us about the other. Semantic labeling (via language) is only one of many of those data streams. We may see someone and not remember their name; we may remember the name but not the face that goes with it. We might connect the two given the additional information that we met at such and such conference three years ago.

Google is in the business of making these associations, finding pages that are about the same thing, and they use every available handle to help them do it. Seen in this way, SEO is already a kind of semantic markup, in which self-interested humans try to add information to pages to enhance their discoverability and ranking by Google. What the Rich Snippets announcement does is tell webmasters and SEO professionals a new way to add structure to their markup.

The problem with explicit metadata like this is that it's liable to gaming. But more dangerously, it generally only captures what we already know. By contrast, implicit metadata can surprise us, giving us new insight into the world. Consider Flickr's maps created by geotagged photos, which show the real boundaries of where people go in cities and what they do there. Here, the metadata may be added explicitly by humans, but it is increasingly added automatically by the camera itself. (The most powerful architecture of participation is one in which data is provided by default, without the user even knowing he or she is doing it.)

Google's Flu Trends is another great example. By mining its search database (what John Battelle calls "the database of intentions") for searches about flu symptoms, Google is able to generate maps of likely clusters of infection. Or look at Jer Thorp's fascinating project announced just the other day, Just Landed: Processing, Twitter, MetaCarta & Hidden Data. Jer built a simulation of the possible spread of swine flu by extracting the string "Just landed in..." from Twitter. Since Twitter profiles include a location, and the object of the phrase above is also likely to be a location, he was able to create the following visualization of travel patterns:

[Video: Just Landed - Test Render (4 hrs), from blprnt on Vimeo]

This is where the rubber meets the road of collective intelligence. I'm a big fan of structured markup, but I remain convinced that even more important is to discover new metadata that is produced, as Wallace Stevens so memorably said, "merely in living as and where we live."

P.S. There's some small irony that in its first steps towards requesting explicit structured data from webmasters, Google is specifying the vocabularies that can be used for its Rich Snippets rather than mining the structured data formats that already exist on the web. It would be more "googlish" (in the machine learning sense I've outlined above) to recognize and use them all, rather than asking webmasters to adopt a new format developed by Google. There's an interesting debate about this irony over on Ian Davis' blog. I expect there to be a lot more debate in the weeks to come.

knol.google.com

Rich Snippets Tips and Tricks - Rich Snippets collection

knol.google.com Google Rich Snippets provides structured data in Google search result snippets. Webmasters can provide this structured data...

Basic instructions


For basic instructions on how to mark up your site for Rich Snippets, refer to the Rich Snippets help pages.



Tips and tricks


Test your markup with the Rich Snippets Testing Tool

Use this tool to check your markup and make sure that Google can extract the structured data from your page. This tool will display the markup found on a specific web page, as well as a preview of how that page might appear in Google search results.

Note that the tool will only display extracted information that is officially supported for Rich Snippets (Reviews, People, Products, and Businesses and Organizations). Microformats or RDFa data that isn't supported generally won't be shown in the tool. Supported RDFa attributes are xmlns, typeof, property, rel, and content.
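To make the attribute list above concrete, here is a minimal sketch of how a parser might read `property` and `content` values out of RDFa markup. It is not Google's parser. The class and function names, the sample review, and the `xmlns:v` namespace URL are assumptions added for illustration, and the sketch assumes well-nested markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class PropertyExtractor(HTMLParser):
    """Collect RDFa `property` values; a `content` attribute wins over text."""

    def __init__(self):
        super().__init__()
        self.found = {}   # property name -> extracted value
        self._open = []   # one (property, content, text_parts) frame per open element

    def _finish(self, frame):
        prop, content, parts = frame
        if prop:
            text = "".join(parts).strip()
            self.found[prop] = content if content is not None else text

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        frame = (a.get("property"), a.get("content"), [])
        if tag in VOID:
            self._finish(frame)
        else:
            self._open.append(frame)

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for _, _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag not in VOID and self._open:
            self._finish(self._open.pop())


def extract_properties(html):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical sample markup (names, dates and namespace are illustrative).
SAMPLE = '''<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">Example Cafe</span>
  <span property="v:reviewer">A. Reader</span>
  <span property="v:dtreviewed" content="2010-07-07">7 July</span>
  <span property="v:rating">4</span>
</div>'''

properties = extract_properties(SAMPLE)
```

Note how `content` lets the page show a human-friendly date ("7 July") while the machine-readable value stays unambiguous.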

This is an early release of the testing tool, and feedback is welcome. Please submit feedback or bugs on the Webmaster Tools forum.

How to mark up ratings that don't use a 5-point scale

Instructions and examples for marking up reviews that use scales that don't range from 1 (worst) to 5 (best).
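The linked instructions are not reproduced here, but the underlying arithmetic can be sketched. The helper below, a hypothetical function that is not part of any Google tool, linearly maps a rating from an arbitrary scale onto the 1 (worst) to 5 (best) range. Whether Google normalizes ratings exactly this way is an assumption.

```python
def to_five_point(value, worst, best):
    """Linearly rescale `value` from [worst, best] onto [1, 5]."""
    if best == worst:
        raise ValueError("best and worst must differ")
    fraction = (value - worst) / (best - worst)  # 0.0 at worst, 1.0 at best
    return 1 + 4 * fraction


# A score of 50 on a 0-100 scale sits in the middle of the 1-5 range.
midpoint = to_five_point(50, worst=0, best=100)  # 3.0
```

This is why a site on a non-standard scale needs to state its best (and, where relevant, worst) possible rating: without both ends of the scale, a bare "7" cannot be placed on a 5-point display.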

Marking up price ranges

Some sites, like Yelp, show Rich Snippets with a price range. But the Rich Snippets help pages don't list how this can be done. What's the deal?!

Google Rich Snippets recognizes the pricerange attribute as an unofficial extension to hCard (microformats) or Organization (RDFa). This can be used in order to show general price ranges associated with a business.

Beware of microformats/CSS naming collisions

A common mistake made when using microformats is to reuse the same values for class attributes when they mean two different things. For example, if you include an hReview on your page, and include a class="rating" inside of it, Google assumes that the text inside the tag containing class="rating" is in fact a rating. But if class="rating" is also used for a different purpose on your page, for example to change the font of some text on the page related to the rating, your content will not be parsed correctly by Google's structured data parsers. When using microformats, make sure that you aren't inadvertently using the same class name for multiple purposes.
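One rough way to spot such collisions is to look for class="rating" used outside any hReview container, which suggests the name doubles as a styling hook. The sketch below is only a heuristic under that assumption. The class name `RatingCollisionFinder`, the sample page, and the choice of `hreview` and `hreview-aggregate` as container classes are illustrative.

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
CONTAINERS = {"hreview", "hreview-aggregate"}


class RatingCollisionFinder(HTMLParser):
    """Flag class="rating" elements that are not inside an hReview."""

    def __init__(self):
        super().__init__()
        self.stray = []    # (line number, tag name) of suspicious uses
        self._stack = []   # True for each open element that is an hReview root

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        is_root = bool(classes & CONTAINERS)
        inside = is_root or any(self._stack)
        if "rating" in classes and not inside:
            self.stray.append((self.getpos()[0], tag))
        if tag not in VOID:
            self._stack.append(is_root)

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()


def find_stray_ratings(html):
    finder = RatingCollisionFinder()
    finder.feed(html)
    finder.close()
    return finder.stray


# Hypothetical page: "rating" is used for styling on lines 1 and 6.
PAGE = '''<h2 class="rating">Our rating system</h2>
<div class="hreview">
  <span class="item"><span class="fn">Example Cafe</span></span>
  <span class="rating">4</span>
</div>
<p class="rating note">Ratings are updated weekly.</p>'''

stray = find_stray_ratings(PAGE)
```

Renaming the styling hooks (for example to a site-specific class name) leaves `rating` free to mean only one thing.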

Reviews vs. Votes

Some sites have a separate notion of "reviews" and "votes." A product may have received 4 stars based on 55 "votes" but only 8 people actually wrote reviews. In cases like these, make sure that you mark up the number of reviews, not the number of votes, using "count" in hReview-aggregate (microformats) or Review-aggregate (RDFa). An RDFa example is as follows:
Correct:
...
<span property="v:rating">4</span> stars (based on 55 votes)
<span property="v:count">8</span> reviews
...
Incorrect:
...
<span property="v:rating">4</span> stars (based on <span property="v:count">55</span> votes)
8 reviews
...
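If this markup is generated from a template, the distinction can be enforced in code. The following sketch uses a hypothetical helper, `render_aggregate`, to render the "Correct" pattern so that `v:count` always wraps the number of written reviews and the vote total stays plain text.

```python
from html import escape


def render_aggregate(rating, votes, reviews):
    """Render aggregate review markup with v:count on reviews, not votes."""
    return (
        f'<span property="v:rating">{escape(str(rating))}</span> stars '
        f'(based on {escape(str(votes))} votes)\n'
        f'<span property="v:count">{escape(str(reviews))}</span> reviews'
    )


snippet = render_aggregate(rating=4, votes=55, reviews=8)
```

Keeping votes and reviews as separate, named parameters makes it harder to pass the wrong number into the marked-up span.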

Hidden div's -- don't do it!

It can be tempting to add all the content relevant for a rich snippet in one place on the page, mark it up, and then hide the entire block of text using CSS or other techniques. Don't do this! Mark up the content where it already exists. Except in special circumstances (for example when marking the best possible rating for review sites that don't use a 5-point rating scale), Google will not show content from hidden div's in Rich Snippets.


Frequently Asked Questions


Who is eligible for Rich Snippets?

Currently, review sites and social networking/people profile sites are eligible. We plan to expand Rich Snippets to other types of content in the future.

How do I get Rich Snippets to show up for my site?

First, make sure you are marking up your pages in a way that Google understands. Use the Rich Snippets Testing Tool to see if Google's parsers can extract the data that you have marked up. Once you have marked up content on the relevant pages across your site and confirmed that the marked up content can be extracted successfully by Google, sign up on the "Interested in Rich Snippets" form. Rich snippets from new sites will be enabled automatically from this list over time.

Google does not guarantee that Rich Snippets will show up for search results from a particular site even if structured data is marked up and can be extracted successfully according to the testing tool. Here are some reasons that marked-up pages might not be shown with Rich Snippets:

  • The marked-up structured data is not representative of the main content of the page.
  • Marked-up data is incorrect or misleading.
  • Marked up content is hidden from the user (see the section above: "Hidden div's -- don't do it!")
  • The site has very few pages (or very few pages with marked-up structured data) and may not be picked up by Google's Rich Snippets system.

Why am I seeing "Insufficient data to generate the preview" in the Rich Snippets Testing Tool?

If you have added markup to your web page, then the Rich Snippets Testing Tool should show the markup that was found in the "Extracted Rich Snippets data from the page" section. However, this does not necessarily mean that a Google search preview will be generated. There are several ways in which this situation can arise.

1) You have provided information which is not currently used to change the display of search results. 
Currently we show rich snippets for review sites and social networking/people profile sites. Video markup is recognized and used to make sure that video content is crawled and indexed properly and can show up amongst Google's video results. However, no preview is generated. Organization and Product markup is recognized but is not yet used to affect the display of search results. In each of these cases, the markup can be correct without any preview being generated in the testing tool.

2) You have not provided enough information to show a rich snippet.
When providing Review or Person information, certain data is required in order to generate a rich snippet preview. For example, a Review without a reviewer or a Review-aggregate without a count will not generate a preview. A Person without enough marked up information will not generate a rich snippet either.

3) There are errors in the markup.
Make sure that the information that you marked up is showing up in the "Extracted data from the page" section. If it doesn't appear, Google didn't find the markup. For webmasters using RDFa markup, make sure you use the correct property names. For example, the correct property name for marking the number of reviews on a page is called "count." If you mark up the number of reviews using a property labeled "reviewCount" instead of "count" no preview will be generated.
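Cases 2) and 3) above can be sketched as a small pre-flight check on the property names you have marked up. The required properties listed here are only the ones named in this FAQ (a reviewer for Review, a count for Review-aggregate), not a complete list. The function name and the "ends with" heuristic for spotting misnamed properties are assumptions.

```python
# Required properties mentioned in the FAQ above; not an exhaustive list.
REQUIRED = {
    "Review": {"reviewer"},
    "Review-aggregate": {"count"},
}


def preview_problems(item_type, properties):
    """Return human-readable reasons why a preview might not be generated."""
    problems = []
    for name in sorted(REQUIRED.get(item_type, set())):
        if name in properties:
            continue
        # Heuristic: "reviewCount" looks like a misnamed "count".
        lookalikes = [p for p in properties if p.lower().endswith(name)]
        if lookalikes:
            problems.append(f'"{lookalikes[0]}" should be "{name}"')
        else:
            problems.append(f'missing required "{name}"')
    return problems


misnamed = preview_problems("Review-aggregate", {"rating", "reviewCount"})
missing = preview_problems("Review", {"rating"})
```

A check like this complements the testing tool rather than replacing it, since only the tool shows what Google actually extracted.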

Are other Google services (Maps, Product Search, etc) affected by Rich Snippets markup?

Data provided through microformats or RDFa markup is used in some ways by other Google services, but Rich Snippets does not replace any other existing channels for providing information to Google. If you are providing data to Google in a different way (for example, if you are providing a Base feed for Google Product Search), keep doing what you are doing.

Why doesn't Google support [insert your favorite RDFa vocabulary here]?

This is an evolving process, and the formats supported at initial launch are a first step. Support for additional popular formats will be added over time assuming that the data helps users find search results more effectively and that there is significant usage of the format across the web.


Microformats: What They Are and How To Use Them - Smashing Magazine

www.smashingmagazine.com

Web 2.0 has its positive and its negative sides. Despite the tremendous technological improvements brought by Ajax, semantically organized content and the growing popularity of RSS feeds, the term “Web 2.0” still hasn’t managed to establish itself as meaning a renewed Web rather than a new revolutionary technology, as it is mistakenly called.

As a consequence, many renewed techniques that seem to be related to the “new” Web aren’t fully or properly understood. This leads to public misunderstandings and keeps both developers and users from using (and improving) these techniques.

One of the new terms on the horizon is Microformats (sometimes abbreviated µF or uF): formats that make it possible to create meta-content that machines can not only read but also understand (which was the basic idea of the Semantic Web, which is not Web 2.0). This post explains what Microformats are, what advantages they offer, and how you can use them to enrich your content and make it more visible and understandable to search engines.

Things you should know about Microformats

  • “Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.” [Microformats: Official definition]
  • “Microformats is the generic name given to any format that builds on XML (X)HTML to provide additional metadata about web objects.” [Microcontent Design]
  • “Microformats are simple codes that you can use to identify specific kinds of data, like people or events, in your webpages.” [Chris Messina]
  • “A microformat is a piece of mark up that allows expression of semantics in an HTML (or XHTML) web page. Programs can extract meaning from a web page that is marked up with one or more microformats.” [Wikipedia: Microformats]
  • “With Microformats, you can send & publish things like events, business cards, and product reviews as meaningful XHTML that a person can read in a browser, but a program can import, index and remix as native data.” [Michael McCracken]
  • “Microformats are about using the standards we all know [...] to convey as much semantic meaning as possible. They use current XHTML tags such as address, cite, and blockquote and attributes such as rel, rev, and title to create semantically appropriate blocks of code.” [Microformats Primer]
  • “Microformats are not a new language, but adapted to current behaviors and usage patterns and is connected with semantic XHTML.” [About Microformats]
  • “Microformats principles: solve a specific problem, simple as possible, reuse from widely adopted standards (semantic (X)HTML), modularity / embeddability, decentralized development, content, services.” [What are microformats]
  • “That’s what microformats are, adding semantics to markup to take it from being machine readable to being machine understandable.” [Microformats: Introduction]
  • “There are lots of different microformats, ranging from very fundamental types of information like contacts, locations, and events, to the slightly more domain specific, like reviews and resumes, to the very domain specific, like wines.” [Microformats: Introduction]

Existing Microformats

  • hAtom
    hAtom is a microformat for content that can be syndicated, primarily but not exclusively weblog postings. hAtom is based on a subset of the Atom syndication format.
  • hCalendar | hCalendar Creator
    hCalendar is a simple, open, distributed calendaring and events format, suitable for embedding in (X)HTML, Atom, RSS, and arbitrary XML.
  • hCard | hCard Creator
    hCard is a format for representing people, companies, organizations, and places, in semantic XHTML.
  • hResume | hResume Creator
    hResume is a microformat for publishing resumes and CVs.
  • hReview | hReview Creator
    hReview is an open, distributed format, suitable for embedding reviews (of products, services, businesses, events, etc.) in (X)HTML, Atom, RSS, and arbitrary XML.
  • rel="nofollow"
    Is an HTML attribute value used to instruct search engines that a hyperlink should not influence the link target’s ranking in the search engine’s index. Regarded as a microformat.
  • rel="tag"
    By adding rel="tag" to a hyperlink, a page indicates that the destination of that hyperlink is an author-designated “tag” (or keyword/subject) for the current page. Note that a tag may just refer to a major portion of the current page (i.e. a blog post). For example, by placing this link on a page,
    <a href="http://technorati.com/tag/tech" rel="tag">tech</a>, the author indicates that the page has the tag “tech”.
  • XFN
    XHTML Friends Network (XFN) is a simple way to represent human relationships using hyperlinks developed by Global Multimedia Protocols Group. XFN enables web authors to indicate their relationship(s) to the people in their blogrolls simply by adding a ‘rel’ attribute to their <a href> tags, e.g.:
    <a href="http://jeff.example.org" rel="friend met">Jeff</a>.
  • XOXO
    XOXO (eXtensible Open XHTML Outlines) is an XML format for outlines built from XHTML modularization. Developed by several authors as an attempt to reuse XHTML building blocks instead of inventing unnecessary new XML elements/attributes, XOXO is both based on existing behavior of publishing outlines, lists, and blogrolls on the Web, and as a general outline format for 1:1 processing of fundamental programming language datastructures.
  • xFolk
    xFolk is a simple and open format for publishing collections of bookmarks.
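Because rel="tag" and XFN both live in the ordinary `rel` attribute, reading them takes very little code. The sketch below collects `rel` values from links and derives the tag name from the last path segment of the link target, as the rel-tag convention describes. The class and function names are illustrative, and the sample reuses the two links quoted above with assumed link text.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelLinkCollector(HTMLParser):
    """Collect (href, [rel values]) for every <a> that carries a rel attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").split()  # rel is a space-separated list
        if a.get("href") and rels:
            self.links.append((a["href"], rels))


def collect_rel_links(html):
    collector = RelLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def tag_names(links):
    """For rel="tag" links, the tag is the last path segment of the href."""
    names = []
    for href, rels in links:
        if "tag" in rels:
            path = urlparse(href).path.rstrip("/")
            names.append(unquote(path.rsplit("/", 1)[-1]))
    return names


SAMPLE = '''<a href="http://technorati.com/tag/tech" rel="tag">tech</a>
<a href="http://jeff.example.org" rel="friend met">Jeff</a>'''

links = collect_rel_links(SAMPLE)
tags = tag_names(links)
```

Note that the tag comes from the URL, not from the visible link text, so the two can differ.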

Advantages of Microformats

  • “Say you want to sell your car. [...] What if we could somehow post a listing to our blog, and then easily let services which cared about classifieds listings know that there is a new or updated classified at my site. The missing piece that would enable this is a standard format (after all html doesn’t have a element).” [Add Microformats Magic to your site]
  • “Now your information is scattered all over the Web, and you have to pick which sites you want to use. Soon: the combination of blogging and microformats is now reversing this model. Now, your information remains in your blog, and the Web sites come to you. For instance, if you want to sell something, you can blog about it using an hListing, and a site like edgeio will find it when it aggregates classified advertisements across the Web.” [Microformats: Introduction]
  • “Microformats enable the publishing and sharing of higher fidelity information on the Web. Small bits of (X)HTML that identify richer data types like people and events in your webpages. Building blocks that enable users to own, control, move, and share their data on the Web.” [What are microformats]
  • “Like CSS, microformats let you do some interesting things through JavaScript and the DOM. After all, microformats are just a bunch of XHTML.” [Microformats Primer]
  • Benefits of Microformats: they are readable by (search) machines, and they provide accurate and appropriate metadata and meaningful markup.
  • With Microformats “you can create more consistent content. You can share your microformat with content providers, ensuring that you’ll get content in the right format. You don’t need to DO anything to that content before you present it to users.” [The Awesome Power of Microformats]
  • “So what use would microformats be in a web browser? [...] Future Web browsers are likely going to associate semantically marked up data you encounter on the Web with specific applications, either on your system or online. This means the contact information you see on a Web site will be associated with your favorite contacts application.” [Mozilla Does Microformats]
  • “The idea is that i.e. as soon as any page that has an hCard on it you can add to your address book, you can sync it with your PDA, your handheld, and it makes contact information, personal information, on the web a lot more useful.” [Microformats: Evolving the Web]
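The address-book scenario in the last quote can be sketched in a few lines: find the element with class="vcard" and read the hCard properties inside it. This is a simplified illustration, not a conforming hCard parser. It handles only `fn`, `org`, `tel` and `url`, assumes each property sits on a leaf element, and uses a made-up sample contact.

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HCardReader(HTMLParser):
    """Read a few hCard properties from the first class="vcard" element."""

    TEXT_FIELDS = ("fn", "org", "tel")

    def __init__(self):
        super().__init__()
        self.card = {}
        self._depth = 0      # greater than 0 while inside the vcard element
        self._field = None   # property whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        classes = (a.get("class") or "").split()
        if self._depth == 0:
            if "vcard" in classes:
                self._depth = 1
            return
        if tag not in VOID:
            self._depth += 1
        if "url" in classes and a.get("href"):
            self.card.setdefault("url", a["href"])  # url comes from the href
        for name in self.TEXT_FIELDS:
            if name in classes:
                self._field = name
                self.card.setdefault(name, "")
                break

    def handle_data(self, data):
        if self._field:
            self.card[self._field] += data

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID:
            self._depth -= 1
            self._field = None


def read_hcard(html):
    reader = HCardReader()
    reader.feed(html)
    reader.close()
    return {key: value.strip() for key, value in reader.card.items()}


# Hypothetical contact used only for illustration.
PAGE = '''<div class="vcard">
  <a class="url fn" href="http://example.org/jane">Jane Doe</a>
  <span class="org">Example Ltd</span>
  <span class="tel">+1-555-0100</span>
</div>'''

contact = read_hcard(PAGE)
```

The resulting dictionary is the kind of structured record that a tool could hand to a contacts application or write out as a vCard file.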

Microformats are already being used!

  • Edgeio.com (weblog-based business as a niche for small and large companies), Rubhub.com (determines relationships between websites and people; scenarios: finding alternative connections for supplies in producer chains, booksellers, car suppliers, internal contact management within large companies), Technorati.com (indexes hCard, hCalendar, and hReview; cumulative data is also updated via event-driven pings)
  • Microformats can be used within Firefox Extensions (Tails, Greasemonkey scripts for hCard, hCalendar, xFolk, etc.) and Blogging Extensions (Structured Blogging for WordPress)

Microformats Tools

  • Microformats Bookmarklet
    helps to extract existing hCards and hCalendars and shows and stores existing contacts and events.
  • Tails Export
    An extension for Showing and Exporting Microformats. Currently it supports hCard [export to .vcf file], hCalendar [export to .ics file], hReview, xFolk and Rel-license.
  • Highlight Microformats with CSS
    Those that use Firefox with the Tails extension, read no further. This is not for you. You have it given to you on a plate, you don’t know how lucky you are. This is for those of us using Camino, Safari or Omniweb.
  • Operator
    Operator leverages microformats that are already available on many web pages to provide new ways to interact with web services. It lets you combine pieces of information on Web sites with applications in ways that are useful. For instance, Flickr + Google Maps, Upcoming.org + Google Calendar, Yahoo! Local + your address book, and many more possibilities and permutations.
  • Microformats Dreamweaver Extension
    Microformats Dreamweaver extension (ideally for use with Dreamweaver 8, although should work for MX and above) implements a few simple Insert Bar Objects to help Dreamweaver users to add hCalendar, hCard, rel-license, rel-tag and XFN data to their documents. After installing, you’ll find a new Microformats category on your Insert Bar. Support for more formats is to follow, so check back.
  • microformats.css
    A CSS-based template for existing microformats, based upon the microformats cheatsheet (PDF)
  • Microformats Cheat Sheet
    This Microformats Cheat Sheet covers iCalendar, hCalendar, hReview, vCard, hCard, RelLicense, RelTag, XFN Format and Values and Dates.
  • Microformats Cheat Sheet
    This microformats cheat sheet lists the properties by format and also lists each format and the hierarchy. This includes elemental microformats, compound microformats and some of the standard design patterns used.
  • Microformats Icons
    The starter set contains icons for hCal, hResume, hCard, XFN and a generic TAG icon.

Blogs & Wikis

  • Microformats.org
    Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Learn more about microformats.
  • Microformats Wiki
    What are microformats? What can you do with them?
  • microformatique
    Microformatique is an unofficial blog covering all things microformats, and “data at the edges”. Latest specifications, presentations, events, publications and more. It’s put together by John Allsopp.