Semantic Web, SEO and Google Rich Snippets
Google's Rich Snippets and the Semantic Web - O'Reilly Radar
07-07-2010
radar.oreilly.com
There's a long-running debate between those who advocate for semantic markup, and those who believe that machine learning will eventually get us to the holy grail of a Semantic Web, one in which computer programs actually understand the meaning of what they see and read. Google has of course been the great proof point of the power of machine learning algorithms. Earlier this week, Google made a nod to the other side of the debate, introducing a feature that they call Rich Snippets.
So, for example, consider the snippet for the Yelp review page on the Slanted Door restaurant in San Francisco:
The snippet is enhanced to show the number of reviews and the average star rating, with a snippet actually taken from one of the reviews. By contrast, the Citysearch results for the same restaurant are much less compelling:
(Yelp is one of Google's partners in the rollout of Rich Snippets; Google hopes that others will follow their lead in using enhanced markup, enabling this feature.)
Rich snippets could be a turning point for the Semantic Web, since, for the first time, they create a powerful economic motivation for semantic markup. Google has told us that rich snippets significantly enhance click-through rates. That means that anyone who has been doing SEO is now going to have to add microformats and RDFa to their toolkit.
Historically, the biggest block to the Semantic Web has been the lack of a killer app that would drive widespread adoption. There was always a bit of a chicken-and-egg problem, in which users would need to do a lot of work to mark up the data for the benefit of others before getting much of a payoff themselves. But as Dan Bricklin remarked so insightfully in his 2000 paper on Napster, The Cornucopia of the Commons, the most powerful online dynamics are released not by appeals to volunteerism, but by self-interest:
What we see here is that increasing the value of the database by adding more information is a natural by-product of using the tool for your own benefit. No altruistic sharing motives need be present...
What I also find interesting about the announcement is the blurring line between machine learning and semantic markup.
Machine learning isn't just brute force analysis of unstructured data. In fact, while Google is famous as a machine-learning company, their initial breakthrough with PageRank was based on the realization that there was hidden metadata in the link structure of the web that could be used to improve search results. It was precisely their departure from previous brute force methods that gave them some of their initial success. Since then, they have been diligent in developing countless other algorithms based on regular features of the data, and in particular regular associations between data sets that routinely appear together - implied metadata, so to speak.
So, for example, people are associated with addresses, with dates, with companies, with other people, with documents, with pictures and videos. Those associations may be made explicitly, via tags or true structured markup, but given a large enough data set, they can be extracted automatically. Jeff Jonas calls this process "context accumulation." It's the way that our own brains operate: over time, we make associations between parallel data streams, each of which informs us about the other. Semantic labeling (via language) is only one of many of those data streams. We may see someone and not remember their name; we may remember the name but not the face that goes with it. We might connect the two given the additional information that we met at such and such conference three years ago.
Google is in the business of making these associations, finding pages that are about the same thing, and they use every available handle to help them do it. Seen in this way, SEO is already a kind of semantic markup, in which self-interested humans try to add information to pages to enhance their discoverability and ranking by Google. What the Rich Snippets announcement does is tell webmasters and SEO professionals a new way to add structure to their markup.
The problem with explicit metadata like this is that it's liable to gaming. But more dangerously, it generally only captures what we already know. By contrast, implicit metadata can surprise us, giving us new insight into the world. Consider Flickr's maps created by geotagged photos, which show the real boundaries of where people go in cities and what they do there. Here, the metadata may be added explicitly by humans, but it is increasingly added automatically by the camera itself. (The most powerful architecture of participation is one in which data is provided by default, without the user even knowing he or she is doing it.)
Google's Flu Trends is another great example. By mining its search database (what John Battelle calls "the database of intentions") for searches about flu symptoms, Google is able to generate maps of likely clusters of infection. Or look at Jer Thorp's fascinating project announced just the other day, Just Landed: Processing, Twitter, MetaCarta & Hidden Data. Jer simulated the possible spread of swine flu by extracting the string "Just landed in..." from Twitter. Since Twitter profiles include a location, and the object of the phrase above is also likely to be a location, he was able to create the following visualization of travel patterns:
Just Landed - Test Render (4 hrs) from blprnt on Vimeo.
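The extraction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the `extract_trip` function, the regular expression and the sample tweets are all invented for the example, and turning place names into map coordinates is left out entirely.

```python
import re

# Assumption: the destination is whatever follows "just landed in",
# up to the next punctuation mark, hashtag or mention.
LANDED = re.compile(r"just landed in\s+([^.!?,;:#@\n]+)", re.IGNORECASE)


def extract_trip(home, tweet):
    """Return (home, destination) if the tweet reports a landing, else None.

    `home` stands in for the location taken from the user's profile.
    """
    match = LANDED.search(tweet)
    if not match:
        return None
    destination = match.group(1).strip()
    return (home, destination) if destination else None


# Hypothetical sample data: (profile location, tweet text).
samples = [
    ("Vancouver", "Just landed in Hong Kong! Long flight."),
    ("London", "just landed in San Francisco, off to the conference"),
    ("Berlin", "Staying home this week."),
]

trips = [trip for home, text in samples if (trip := extract_trip(home, text))]
for origin, destination in trips:
    print(f"{origin} -> {destination}")
```

Each resulting pair is one edge in the travel graph; the implicit metadata here is the combination of a profile field and a stock phrase, neither of which was written with mapping in mind.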
This is where the rubber meets the road of collective intelligence. I'm a big fan of structured markup, but I remain convinced that even more important is to discover new metadata that is produced, as Wallace Stevens so memorably said, "merely in living as and where we live."
P.S. There's some small irony that in its first steps towards requesting explicit structured data from webmasters, Google is specifying the vocabularies that can be used for its Rich Snippets rather than mining the structured data formats that already exist on the web. It would be more "googlish" (in the machine learning sense I've outlined above) to recognize and use them all, rather than asking webmasters to adopt a new format developed by Google. There's an interesting debate about this irony over on Ian Davis' blog. I expect there to be a lot more debate in the weeks to come.
tags: google, microformats, semantic web
Rich Snippets Tips and Tricks - collection de Rich Snippets
07-07-2010
knol.google.com
Google Rich Snippets provides structured data in Google search result snippets. Webmasters can provide this structured data...
Basic instructions
Tips and tricks
Test your markup with the Rich Snippets Testing Tool
How to mark up ratings that don't use a 5-point scale
Marking up price ranges
Beware of microformats/CSS naming collisions
Reviews vs. Votes
Correct:
...
<span property="v:rating">4</span> stars (based on 55 votes)
<span property="v:count">8</span> reviews
...
Incorrect:
...
<span property="v:rating">4</span> stars (based on <span property="v:count">55</span> votes)
8 reviews
...
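To see why the second version is a problem, it can help to look at the markup the way a program would. The following Python sketch is a simplified illustration, not Google's parser: it assumes flat, non-nested `<span property="...">` elements like the ones above, and the `PropertyCollector` and `extract_properties` names are made up for the example.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect the text of elements that carry a `property` attribute.

    Assumption: marked-up elements are not nested inside each other.
    """

    def __init__(self):
        super().__init__()
        self.values = {}      # property name -> text content
        self._current = None  # property of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.values[prop] = ""

    def handle_data(self, data):
        if self._current:
            self.values[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def extract_properties(html):
    collector = PropertyCollector()
    collector.feed(html)
    collector.close()
    return {name: text.strip() for name, text in collector.values.items()}


correct = (
    '<span property="v:rating">4</span> stars (based on 55 votes) '
    '<span property="v:count">8</span> reviews'
)
incorrect = (
    '<span property="v:rating">4</span> stars (based on '
    '<span property="v:count">55</span> votes) 8 reviews'
)

print(extract_properties(correct))    # v:count carries the number of reviews
print(extract_properties(incorrect))  # v:count carries the number of votes
```

In the incorrect version a machine reading the page would take 55 as the review count, even though a human reader sees 8 reviews.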
Hidden div's -- don't do it!
Frequently Asked Questions
Who is eligible for Rich Snippets?
- The marked-up structured data is not representative of the main content of the page.
- Marked-up data is incorrect or misleading.
- Marked up content is hidden from the user (see the section above: "Hidden div's -- don't do it!")
- The site has very few pages (or very few pages with marked-up structured data) and may not be picked up by Google's Rich Snippets system.
Are other Google services (Maps, Product Search, etc) affected by Rich Snippets markup?
Why doesn't Google support [insert your favorite RDFa vocabulary here]?
Microformats: What They Are and How To Use Them - Smashing Magazine
07-07-2010
www.smashingmagazine.com
- By Vitaly Friedman
- May 4th, 2007
- Coding
Web 2.0 has its positive and its negative sides. Apart from the tremendous technological improvements provided by Ajax, semantically organized content and the growing popularity of RSS feeds, the term “Web 2.0” still hasn’t managed to assert itself as the renewed Web rather than the revolutionary new technology it is mistakenly called.
The consequence: many renewed techniques that seem to be related to the “new” Web aren’t fully or properly understood. This leads to public misunderstandings and keeps both developers and users from using (and improving) these techniques.
Things you should know about Microformats
- “Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.” [Microformats: Official definition]
- “Microformats is the generic name given to any format that builds on XML (X)HTML to provide additional metadata about web objects.” [Microcontent Design]
- “Microformats are simple codes that you can use to identify specific kinds of data, like people or events, in your webpages.” [Chris Messina]
- “A microformat is a piece of mark up that allows expression of semantics in an HTML (or XHTML) web page. Programs can extract meaning from a web page that is marked up with one or more microformats.” [Wikipedia: Microformats]
- “With Microformats, you can send & publish things like events, business cards, and product reviews as meaningful XHTML that a person can read in a browser, but a program can import, index and remix as native data.” [Michael McCracken]
- “Microformats are about using the standards we all know [...] to convey as much semantic meaning as possible. They use current XHTML tags such as address, cite, and blockquote and attributes such as rel, rev, and title to create semantically appropriate blocks of code.” [Microformats Primer]
- “Microformats are not a new language, but adapted to current behaviors and usage patterns and is connected with semantic XHTML.” [About Microformats]
- “Microformats principles: solve a specific problem, simple as possible, reuse from widely adopted standards (semantic (X)HTML), modularity / embeddability, decentralized development, content, services.” [What are microformats]
- “That’s what microformats are, adding semantics to markup to take it from being machine readable to being machine understandable.” [Microformats: Introduction]
- “There are lots of different microformats, ranging from very fundamental types of information like contacts, locations, and events, to the slightly more domain specific, like reviews and resumes, to the very domain specific, like wines.” [Microformats: Introduction]
Existing Microformats
- hAtom
  hAtom is a microformat for content that can be syndicated, primarily but not exclusively weblog postings. hAtom is based on a subset of the Atom syndication format.
- hCalendar | hCalendar Creator
  hCalendar is a simple, open, distributed calendaring and events format, suitable for embedding in (X)HTML, Atom, RSS, and arbitrary XML.
- hCard | hCard Creator
  hCard is a format for representing people, companies, organizations, and places, in semantic XHTML.
- hResume | hResume Creator
  hResume is a microformat for publishing resumes and CVs.
- hReview | hReview Creator
  hReview is an open, distributed format, suitable for embedding reviews (of products, services, businesses, events, etc.) in (X)HTML, Atom, RSS, and arbitrary XML.
- rel="nofollow"
  An HTML attribute value used to instruct search engines that a hyperlink should not influence the link target’s ranking in the search engine’s index. Regarded as a microformat.
- rel="tag"
  By adding rel="tag" to a hyperlink, a page indicates that the destination of that hyperlink is an author-designated “tag” (or keyword/subject) for the current page. Note that a tag may just refer to a major portion of the current page (e.g. a blog post). For example, by placing the link <a href="http://technorati.com/tag/tech" rel="tag">tech</a> on a page, the author indicates that the page has the tag “tech”.
- XFN
  XHTML Friends Network (XFN) is a simple way to represent human relationships using hyperlinks, developed by the Global Multimedia Protocols Group. XFN enables web authors to indicate their relationship(s) to the people in their blogrolls simply by adding a ‘rel’ attribute to their <a href> tags, e.g.: <a href="http://jeff.example.org" rel="friend met">.
- XOXO
  XOXO (eXtensible Open XHTML Outlines) is an XML format for outlines built from XHTML modularization. Developed by several authors as an attempt to reuse XHTML building blocks instead of inventing unnecessary new XML elements/attributes, XOXO is based on the existing behavior of publishing outlines, lists, and blogrolls on the Web, and also serves as a general outline format for 1:1 processing of fundamental programming language data structures.
- xFolk
  xFolk is a simple and open format for publishing collections of bookmarks.
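Because rel="tag" and XFN both live in the ‘rel’ attribute of ordinary links, a program can read them with very little effort. The Python sketch below is a minimal illustration using the two example links from the list above; the class and function names are invented for the example, and it assumes that the tag name is the last path segment of the link target (as in the technorati example) and that only the XFN values listed in `XFN_VALUES` count as relationships.

```python
from html.parser import HTMLParser

# XFN relationship values recognised by this sketch.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class RelLinkCollector(HTMLParser):
    """Collect (href, rel values) for every <a> that has both attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href, rel = attributes.get("href"), attributes.get("rel")
        if href and rel:
            self.links.append((href, set(rel.lower().split())))


def collect_rel_links(html):
    collector = RelLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def page_tags(links):
    """rel="tag": take the last path segment of the link target as the tag."""
    return [href.rstrip("/").rsplit("/", 1)[-1]
            for href, rels in links if "tag" in rels]


def relationships(links):
    """XFN: map each linked person to the relationship values on the link."""
    found = {}
    for href, rels in links:
        xfn = sorted(rels & XFN_VALUES)
        if xfn:
            found[href] = xfn
    return found


page = (
    '<a href="http://technorati.com/tag/tech" rel="tag">tech</a> '
    '<a href="http://jeff.example.org" rel="friend met">Jeff</a>'
)
links = collect_rel_links(page)
print(page_tags(links))
print(relationships(links))
```

The same links still work as plain hyperlinks for human readers, which is the point of the “humans first, machines second” principle.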
Advantages of Microformats
- “Say you want to sell your car. [...] What if we could somehow post a listing to our blog, and then easily let services which cared about classifieds listings know that there is a new or updated classified at my site. The missing piece that would enable this is a standard format (after all html doesn’t have a [...] element).” [Add Microformats Magic to your site]
- “Now your information is scattered all over the Web, and you have to pick which sites you want to use. Soon: the combination of blogging and microformats is now reversing this model. Now, your information remains in your blog, and the Web sites come to you. For instance, if you want to sell something, you can blog about it using an hListing, and a site like edgeio will find it when it aggregates classified advertisements across the Web.” [Microformats: Introduction]
- “Microformats enable the publishing and sharing of higher fidelity information on the Web. Small bits of (X)HTML that identify richer data types like people and events in your webpages. Building blocks that enable users to own, control, move, and share their data on the Web.” [What are microformats]
- “Like CSS, microformats let you do some interesting things through JavaScript and the DOM. After all, microformats are just a bunch of XHTML.” [Microformats Primer]
- Benefits of Microformats: they are (search) machine-readable, accurate and appropriate metadata, meaningful markup.
- With Microformats “you can create more consistent content. You can share your microformat with content providers, ensuring that you’ll get content in the right format. You don’t need to DO anything to that content before you present it to users.” [The Awesome Power of Microformats]
- “So what use would microformats be in a web browser? [...] Future Web browsers are likely going to associate semantically marked up data you encounter on the Web with specific applications, either on your system or online. This means the contact information you see on a Web site will be associated with your favorite contacts application.” [Mozilla Does Microformats]
- “The idea is that i.e. as soon as any page that has an hCard on it you can add to your address book, you can sync it with your PDA, your handheld, and it makes contact information, personal information, on the web a lot more useful.” [Microformats: Evolving the Web]
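The address-book scenario from the last quote can be sketched roughly as follows. This is a hedged illustration rather than a conforming hCard parser: it only reads the text of elements whose class is fn, org or tel, assumes well-formed XHTML in which each property appears once, and writes a simplified vCard (a complete one needs more, for instance an N property). The names `HCardReader`, `read_hcard` and `to_vcard`, and the sample contact, are made up for the example.

```python
from html.parser import HTMLParser

# hCard class names handled by this sketch, mapped to vCard property names.
HCARD_TO_VCARD = {"fn": "FN", "org": "ORG", "tel": "TEL"}


class HCardReader(HTMLParser):
    """Collect the text content of elements classed fn, org or tel."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._open = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in HCARD_TO_VCARD), None)
        self._open.append(prop)
        if prop:
            self.fields.setdefault(prop, "")

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        for prop in self._open:
            if prop:
                self.fields[prop] += data


def read_hcard(html):
    reader = HCardReader()
    reader.feed(html)
    reader.close()
    return reader.fields


def to_vcard(fields):
    """Render the collected fields as simplified vCard text."""
    lines = ["BEGIN:VCARD", "VERSION:3.0"]
    for prop, name in HCARD_TO_VCARD.items():
        value = " ".join(fields.get(prop, "").split())
        if value:
            lines.append(f"{name}:{value}")
    lines.append("END:VCARD")
    return "\n".join(lines)


# Hypothetical contact marked up as an hCard.
page = (
    '<div class="vcard"><span class="fn">Jane Doe</span>, '
    '<span class="org">Example Corp</span>, '
    '<span class="tel">+1-555-0100</span></div>'
)
print(to_vcard(read_hcard(page)))
```

A tool such as a browser extension could hand text like this to a contacts application, which is the kind of integration the quotes above anticipate.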
Microformats are already being used!
- Edgeio.com (weblog-based business as a niche for small and large companies), Rubhub.com (determines relationships between websites and people; scenarios: finding alternative connections for supplies in producer chains, booksellers, car suppliers, internal contact management within large companies), Technorati.com (indexes hCard, hCalendar, and hReview; cumulative data is also updated via event-driven pings)
- Microformats can be used within Firefox extensions (Tails, Greasemonkey scripts for hCard, hCalendar, xFolk, etc.) and blogging extensions (Structured Blogging for WordPress)
Articles About Microformats
- microformats – What are microformats?
  Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Learn more about microformats. The official web page.
- Digital Web Magazine – Microformats Primer
  Introductory article on Microformats by Garrett Dimon.
- Add microformats magic to your site
  Heard of the semantic web? Using Microformats everyone can contribute to the richness of the web. John Allsopp explains how.
- Digital Web Magazine – The Big Picture on Microformats
  “In this article, we’ll review what people are doing with microformats right now, and finish up by looking at a couple of cool projects that might whet your appetite for microformats’ future prospects.” By John Allsopp.
- Microformats: Evolving the Web
  Jeremy Keith – This is a transcript of a panel I sat in on at South by Southwest 2006. My fellow panelists are Chris Messina and Norm! The moderator is Tantek Çelik.
- Mozilla Does Microformats
  Firefox 3 as Information Broker – Richard MacManus.
- What are Microformats?
  A presentation by Tantek Çelik.
- Introduction to Microformats
  Microformats: Introduction, Structured Data, The Fundamental, Introducing Operator.
- Microformats and the Decentralized Future of Online Marketing
  Firefox’s Alex Faaborg has raised quite a few eyebrows with his piece on Microformats and the possibilities that exist for these platforms in terms of browser implementation.
- Microformats Challenge Web Feeds and Web APIs!
  Microformats are subversive: they not only challenge the approach of full-blown Semantic Web approaches, but even question fundamental Web 2.0 building blocks such as Web Feeds and Web APIs.
- The Awesome Power of Microformats
  What Are Microformats? – by Kevin Lawver.
- Usable Microformats
  If you’re relatively new to microformats, then this article was written with you in mind. You don’t need to have any prior knowledge to understand what’s going on here. By Andy Hume.
- Microformats
  Microformats – Designed for humans first. By Prof. Dr. Mathias Weske.
Microformats Tools
- Microformats Bookmarklet
  Helps to extract existing hCards and hCalendars, and shows and stores existing contacts and events.
- Tails Export
  An extension for showing and exporting microformats. Currently it supports hCard [export to .vcf file], hCalendar [export to .ics file], hReview, xFolk and rel-license.
- Highlight Microformats with CSS
  Those that use Firefox with the Tails extension, read no further. This is not for you. You have it given to you on a plate, you don’t know how lucky you are. This is for those of us using Camino, Safari or Omniweb.
- Operator
  Operator leverages microformats that are already available on many web pages to provide new ways to interact with web services. It lets you combine pieces of information on Web sites with applications in ways that are useful. For instance, Flickr + Google Maps, Upcoming.org + Google Calendar, Yahoo! Local + your address book, and many more possibilities and permutations.
- Microformats Dreamweaver Extension
  The Microformats Dreamweaver extension (ideally for use with Dreamweaver 8, although it should work for MX and above) implements a few simple Insert Bar Objects to help Dreamweaver users add hCalendar, hCard, rel-license, rel-tag and XFN data to their documents. After installing, you’ll find a new Microformats category on your Insert Bar. Support for more formats is to follow, so check back.
- microformats.css
  A CSS-based template for existing microformats, based upon the microformats cheatsheet (PDF).
- Microformats Cheat Sheet
  This Microformats Cheat Sheet covers iCalendar, hCalendar, hReview, vCard, hCard, RelLicense, RelTag, XFN Format and Values and Dates.
- Microformats Cheat Sheet
  This microformats cheat sheet lists the properties by format and also lists each format and the hierarchy. This includes elemental microformats, compound microformats and some of the standard design patterns used.
- Microformats Icons
  The starter set contains icons for hCal, hResume, hCard, XFN and a generic TAG icon.
Tutorials, Introductions to Microformats
- Tutorials on Microformats
  This series of articles deals with numerous aspects of Microformats, including the basic theory and purpose of Microformats, hCard, hCalendar, AHAH, hReview, xFolk, hResume, XOXO and hAtom.
- Intro to microformats
  Confused, alarmed, disparaged? Let’s clear that up. An extensive introduction to the theory and use of Microformats.
- Introduction to Microformats + a look at hCard & hAtom
  Mike Jolley explains step by step what Microformats are, how they can be integrated into web pages and how you can enhance the efficiency of your content using them.
- Pairing Wine and Microformats
  Microformats in Practice: Dan Cederholm about the use of Microformats in Cork’d.
- Microformats in Web Browsers
  This is a concept for putting a Microformats ‘auto-discovery’ user interface in a web browser. Any web browser (although the sketches were originally conceived as a Firefox extension). By Ben Ward.
- Wikipedia: Microformats
  The Wikipedia entry.
- Practical Microformats
  Microformats from the Ground Up – an extensive tutorial, by Ryan King and Brian Suda.
- Using Microformats in WordPress
  There are two approaches you can take. One: Manually pasting relevant microformat code created via microformat creators. Step-by-step instructions are as follows.
Blogs & Wikis
- Microformats.org
  Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Learn more about microformats.
- Microformats Wiki
  What are microformats? What can you do with them?
- microformatique
  Microformatique is an unofficial blog covering all things microformats, and “data at the edges”. Latest specifications, presentations, events, publications and more. It’s put together by John Allsopp.







