Google I/O 2009 – Site Review by the Experts

November 5, 2019


Cutts: So if you have a site
that you want to throw up here– we got one on a piece of paper
just before the session– if you have a site, you know,
write it down, bring it on up. We’ll be looking at it throughout the course
of this session. If we don’t get to look
at your site, feel free to catch me
or Brian or Greg afterwards. So it’s a good chance
to introduce, folks, Greg Grothaus, Brian White. We’re all members of Google’s
search-quality team. So our job is to make sure
that if you type in your name, you don’t get porn back. Right?
Unless you’re a porn star. If you’re a porn star,
then you want to get porn back. But if you just type in,
you know, “iPod car”
or something like that, you want to get
a really relevant result. So without further ado, we’re going to start
jumping right in. If you have a site
you’d like to get reviewed, again, just write it down,
bring it up. Brian or Greg
will start looking at it. I’ll start looking at it.
Yeah, bring up a business card. Anything like that. Okay, cool. Excellent.
All right. And worst case,
we’ll catch you afterwards. Okay, so this one
was kind of interesting. iCarKits.com. This was a submission
in advance. Is the owner of iCarKits
in the room? Excellent.
So I can be mean if I want to. I can be sarcastic
if I want to. There was some good stuff, and there was some bad stuff
about this site. Mostly good, though. When you look at it, what do you think
this site is about? What’s the impression
that you get? They put iPod kits
in cars, right? That’s pretty clear. So it’s a nice thing
that you know what does this site do. We’ll look at a site or two
later on where it’s like,
why does this site exist? We’re not quite sure. And in fact,
normally I do this at SEO–search engine
optimization–conferences. And I get a lot
of really bad ones. Like cheap-hotels-biz-discounts-
online-4you/viagra or something like that. And all the submissions
were pretty much white hat, so I might throw in a few
spam results that we’ve seen, just to kind of spice it up
a little bit. Now, one thing
you might not notice: I’ll bet whoever did
iCarKits.com is proud of the fact that–
do you see this little iPod? That’s actually active. It’s searchable, which is kind of cute, but you
might not realize it, right? So let’s pretend
I have an Audi. And let’s pretend that I have,
you know, an Audi A4. So you can click through, you can pick
a random model year, and then for that,
do I have navigation? Sure, yeah, why not.
I’ve got navigation. For that, you can see
the products that you have. That’s pretty cool. I haven’t actually seen
that sort of active scroller within a picture, right? That’s kind of engaging. But do you think
an average user, who’s not necessarily savvy,
who lands on that page knows to go looking
at the iPod? Probably not. So one of the things that we
always try to communicate is provide multiple ways
to get to a page. If you can have
drop-down boxes, these sort of clickable
search stuff, search engines might be able
to crawl through that. But it’s really nice if you can
have static HTML links, because the crawler pretty much
only knows how to crawl through static HTML links. Now, Google actually
can read JavaScript. And not just stupid, look
for a URL in the JavaScript. We can sometimes click
and execute and try to do really neat things
with JavaScript. But not every search engine
can do that. You don’t want to make it hard
for the search engines to get to your site, so you
don’t want the search engine to have to fill out a form
to find these products. But once you land here,
it’s actually pretty good. Now, one thing that I noticed– you know, they’ve got
text phone numbers, so that’s kind of nice– a lot of people will put their
information in an image, right? “Free shipping.” See this “free shipping”?
That’s an image. Unless you said “free shipping”
in the ALT tag, there’s no way we can actually
read the content of that image. Now, there are people at Google
who have said, “We should OCR the entire Web.” Right? We’ve talked about it, right? Google Books.
We do OCR our own stuff. It would be really cool
if we could OCR “free shipping,” you know,
and “enter the discount code,” and you could index that. And then if somebody searched
for “iPod car kit,” with “free shipping,”
we might return that. But don’t rely on search engines
to OCR the entire Web, ’cause it doesn’t happen
right now. It’s not going to happen
anytime soon. So the more you can have
nice links, like here’s the phone number
just in straight text, the better off that’ll work. What else do you notice
about this? Anybody notice anything strange
about the URL? The URL is
“advanced_search_result.php?set_questions=1
&sort=4d&search_question[7]=14.” That’s a pretty hard URL
to crawl. If I were designing this site
from scratch, I would probably make it
iCarKits.com/products or catalog/audi/a4/2006, right? Because you can encode
all those parameters in a very nice way
that’s sort of like a tree structure,
and anyone could click around. As long as you have
these sort of, you know, multiple parameters on the URL where it’s almost
like search results, it’s a lot harder for
search engines to get through and try to find
those sort of sites. Does that make sense? So if you can make something
like a very static, nicely crawlable link
or set of links, that’s going to be
a lot more useful. I don’t want to dwell
on this one too long, but in general,
it’s a pretty good site. It’s crawlable,
you can click around on links, and you can usually find stuff. And Google does
a pretty good job on it. But I did want to mention
one little tidbit. You’re developers, right? How many people here
are responsible for a Website
or a blog in some way? A forum? Okay. Why do you run your Website? Say again? For people to look at it. Fair enough. But why do you want them
to look at it? What do you want them–what do
you want them to get out of it? To make money.
Exactly. This guy doesn’t want
to educate the world about how to, you know,
install iPod car kits. He wants to make money.
And that’s okay. He can provide a great service
and great information. So ask yourself, what do you
want out of your Website? I happen to run a blog. I don’t run any ads. I often ask myself,
“Why do I run my Website?” Like, what’s the point? I just throw stuff up there
on the Web. I don’t get anybody
to subscribe to newsletters or anything like that. So you should ask yourself, what are you trying to get out
of your Website? This guy
is trying to get sales. If you look at the title
of the Website– I’m gonna hover over it
in Chrome– iPod car adapters–
2006 Audi A6– This is pretty good. You notice he has the word
“iPod car.” I would bet you
dollars to doughnuts that probably his top phrase
is “iPod car.” A lot of people
who are caring about phrases, they’ll put that right there
in their title. And I would probably bet you
that he is getting maybe hundreds of visitors
a day, which is pretty good. If you can get a few of those
to convert at, you know, 180 bucks a pop,
that’s not bad. But one thing I noticed
when I looked at this site last night
is he doesn’t even mention the most popular product
for the iPod and cars. So let’s see if I can
bring it up. There’s a Google AdWords tool,
and I’ll warn you in advance that the Wi-Fi in this room
is a little bit spotty, so I tried to preload
as many tabs as I could. But if you go to this tool,
which you can just search for with “Google AdWords tool”
or “Google keyword tool,” what it will let you do
is it will let you type in a phrase like “iPod car.” And whenever I did that
last night, it said,
“You know what? People type in ‘iPod car,’
like, 100 times a day.” What do you think they type in
250,000 times a day? There’s a product called iTrip. It’s from Griffin. And what it does is it turns
your iPhone into a little FM transmitter. You know, it transmits
over the radio. The word–
It’s starting to come up. The word “iTrip” does not appear
anywhere on this site. So if you’re a developer,
if you care about making money, you know, do a little bit
of research. What are the words
that people are going to use to find your site? So I solve my little CAPTCHA.
I type in “iPod car.” And eventually what it tells you
is, “Hey, one of the big, big phrases that people use in addition to ‘iPod car’
is iTrip.” And this site doesn’t use
that word once. Question. Yeah. Mm-hmm. Absolutely. So the question is,
“How do you get around the fact that, you know,
maybe you don’t– maybe you don’t sell
the iTrip?” The fact is, people make,
you know, there are affiliate programs
for the iTrip. So what you can see right here
is iTrip is queried, like, 240,000 a month in April. And if you were
to do the query “iTrip,” um, there’s Griffin,
there’s all this stuff. I happen to know the guy
that’s number one here, right? And it’s one guy in New York. So it’s not that iTrip is
some hugely competitive phrase that you could never
show up for. They could at least put a page
about the iTrip and start to show up
for some of those searches. Now, I don’t want to get
too hung up. We’ve all–we’ve spent
almost ten minutes. So let’s, uh, look at a couple
other sites. Um… is the owner
of fazalfurniture.com in the room? Good. Good.
I was hoping they wouldn’t be, um, because I have to give
a little bit of tough love to fazalfurniture. So here’s what the site
looks like. Uh, notice there’s a lot
of bolding on this site. If you were a visitor
and you landed on this page, would you be comfortable
with this amount of bolding? Uh, read a few of the words. It’s almost like somebody’s not
necessarily a native speaker. This is the very first thing
a visitor’s going to see. They’re–that’s going to be
their impression of your site. So what’s interesting
is this site is only about six pages deep. There’s a front page
and then there’s a details form for each one of these phrases. And what’s kind of interesting
is you click to get more details on this, say, Indonesian chair,
so you get a little picture, the picture
never gets bigger, right? “Oh, I want more details.
Yeah, show me a bigger picture.” No, the picture never changes.
It stays the same size. So there’s all kinds of ways that you could entice people
to order, and they’re not necessarily
going to get enough information to feel comfortable. There’s something else
you need to think about, which is the fit and finish
of your Website. And we’ll go through
a couple examples of that. Does anybody see anything
that either gives them a huge amount of confidence
in buying from this site or maybe a little bit of pause
where they’d worry? Yeah, the co–
yeah, the contact order, you know, is–it turns out
it’s just a Web form. See this Rattan furniture? If you click to view details, uh, they keep spelling Rattan
a different way, right? And if I were buying
for a motel, I’d be, like, “Well,
don’t you know how to spell the type of furniture
that you’re selling?” Either you do, in which case
you’re deliberately misspelling, which looks bad, or you don’t. And if you don’t know
how to spell the ingredients of your, uh, your furniture, that’s a pretty bad sign
as well. So this is the sort of thing
where a– See, right here
they’re calling it “rottan” furniture, right? And it may very well be
rotten furniture. We don’t know.
But–yeah, go for it. Can we, uh, get
the other laptop up if…
White: Looks like your
background image it’s using, the nice–really nice
wood grain, is, um, 1 megabyte. 1,000 K.
Cutts: Right.
White: It comes in…
Cutts: No wonder my page load
is failing. A one–
yeah, a one-megabyte download– Okay, so we have pretty good
connectivity here. My in-laws live in Omaha. So, like, every Christmas,
I have to go to Omaha. And I try to load
the Google home page. And for a long time,
they were on dial-up. How long do you think it took me
to load the Google home page, which is the most spartan,
austere page, like, on the Web? Like 12 seconds on a modem. If you want to load
a one-megabyte background, you can pretty much start
at, you know–at the beginning of this session
and then come back at the end of this session. And then by the time
you’re done, you’re like, “Oh, this site sort of sucks
a little bit,” right? So pay attention
to the fit and finish. Pay attention
to how it looks for the users, because if the users like it,
you’ll get more conversions. And it can be all kinds
of little things. For example, “about us.” Oh, good, I can find out more
about this person. Click. Click. Click. Crap, it’s not a link, right? So on one hand, this guy’s done
a pretty good job. He’s bought some sort
of template or something like that. So it’s got a “contact us”
and “about us” and “home page.” But he didn’t have
an “about us” page so he just left
that text lying there. And that’s not necessarily
very helpful. So pay attention
to the fit and finish. Pay attention to your images,
how fast things download. Please don’t
call your furniture “rottan” if you’re trying to spell it. And think about how the user’s
going to perceive it. There’s a lot
of key words here, right? He–How many times does this guy
use the word “furniture”? A lot, right? It’s not the case that if you repeat the word
“furniture” 100 times, you will do twice as well
in Google as if you say the word
“furniture” 50 times. I promise you.
We’re pretty smart about that. After the first two
or three times, it ain’t going to help
to say it 85 more times. So he doesn’t need to say,
“Furniture, furniture, furniture, furniture,
furniture.” If someone talked like this, they’d be the most annoying
person in the world, right? So you may work with SEOs.
You may be an SEO. And you may get this question, “Well, what should
my keyword density be?” Don’t worry
about your keyword density. If you have the word
on your page two, three times, Google knows about that word. Think about reading it
out loud. If it sounds
like something normal if someone were reading it
out loud, that’s pretty good. It doesn’t need to be
like you’re a bot saying the word “furniture”
over and over and over again. And there’s a few
interesting things here. He focuses
on the word “furniture,” but he could probably think
about other synonyms, right? Go to the Google keyword tool. There’s lots of keyword tools
around the Web. What are some other words
for “furniture” that he could use
instead of “furniture”? That’s the sort of thing
that users might type in. Okay, so here’s
a little interlude. Savetheinternet.com is dedicated
to net neutrality, Internet heroes and villains. It’s, uh, hosted by Free Press. This is a cool site. This is a site
which is about issues. This is a site about spreading
the word on net neutrality, which, personally, I think
is a pretty cool idea, except… See this, uh–
see, I was on savetheinternet. I’m going to go check out
their blog. So I go and I try to access blog/wp-content/uploads/authors
/pleddle.html. What do you think–
assuming the Wi-Fi holds– what do you think
I’m going to land on if I’m looking
at this site’s blog? Am I going to learn more
about net neutrality? Am I going to learn
about freedom of the press? If you look down here, now it’s waiting
for healthshoponline. Why is a site
about net neutrality linking me
to healthshoponline? It turns out this site
has been hacked. And if you visit the blog,
you’ll be offered VIAGRA, okay? So the point that I want
to get across– and here it comes. Canadian Pharmacy. Number-one Internet online
drug store, right? You guys, as developers, are due for, like,
the most concentrated attack of hackers
that you would believe. It used to be people
would get individual PCs and hack their botnets. And as PCs get better, as more people roll out things
like Windows 7 or switch to a Mac or use different browsers
that are more hardened, the clients are–
are getting pretty good. It’s hard to hack
one of these bots– one of these PCs. So now people are going
to be attacking your Web server. So you have to keep
your server patched. This savetheinternet,
I’ve been to it before. I was like, “Oh, what are they
doing selling pills?” And it turns out, you know,
if you click on “blog,” it says something like, uh,
“We’ve been hacked. We’re trying to figure out
what’s going on.” And yet the hack
is still live, right? So as soon as you find out
that you’ve been hacked, take it all down. Go ahead,
restore from backups. You have to apply your
server stuff really frequently. So stuff like WordPress,
for example, comes out with relatively
frequent updates, and it’s relatively easy. So now they’ve gotten
the blog back, but they don’t realize that
they’re still selling pills. So if you have a site, one thing that I would
absolutely recommend, just run it as a daily search, or set it up in, you know, uh,
just a cron job or something. Do site:yourdomain.com
VIAGRA. Or site:yourdomain.com pills,
or online gambling, or XENICAL,
or Hydrocodone. You know, pick your favorite
debt consolidation, mortgages,
spammy kind of query, because if you’re hacked, then you’ll find out
about it really soon, and you’ll be able to fix it
before it starts to affect your search rankings. Questions about that?
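A minimal sketch of the daily check described here, assuming you run the generated queries yourself (by hand or from a scheduled job) and, separately, scan copies of your own pages for the same terms. The function names and the watch list are illustrative, not part of any Google tool, and the sketch does not call any search API.

```python
import re

# Illustrative watch list, taken from the example terms mentioned in the talk.
SPAM_TERMS = ["viagra", "pills", "online gambling", "xenical", "hydrocodone"]


def build_site_queries(domain, terms=SPAM_TERMS):
    """Return one 'site:domain term' query string per watched term."""
    return [f"site:{domain} {term}" for term in terms]


def find_spam_terms(page_text, terms=SPAM_TERMS):
    """Return the watched terms that occur in page_text, ignoring case."""
    found = []
    for term in terms:
        # Word boundaries avoid matching a term inside a longer word.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, page_text, re.IGNORECASE):
            found.append(term)
    return found


if __name__ == "__main__":
    # Queries to paste into a search engine or feed to your own tooling.
    for query in build_site_queries("yourdomain.com"):
        print(query)

    # Scanning a saved copy of one of your own pages for the same terms.
    sample_page = "<p>Canadian Pharmacy: buy VIAGRA and pills online</p>"
    print(find_spam_terms(sample_page))
```

Any hit from either check is a prompt to look at the page by hand; a legitimate health site would obviously need a different watch list.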
Make sense? Yeah. Yeah. Yep. Yep. Yeah. So the–the question is,
“If you use WordPress, you know, it’s pretty easy
to get hacked.” And the fact is, WordPress
has gotten much, much better. You know, they’re including
one-time keys, all sorts of nonces,
things that protect you. So 2.7.1, 2.7.0 are a lot less likely
to be hacked. But if you use
a stupid password– like, suppose their admin
password was “savetheinternet,” people could still guess it. People can still hack it. We’re actually in the process– there’s a–there’s a site,
google.com/webmasters, and whenever we see your site
get hacked, we will often try
to send you a message and say, “Hey, we found the
following text on your page.” Now, we can’t detect that you’re
hacked every single time. But one thing that we’ve done
is we’ve scanned for obsolete versions
of software. And we’ve said, “Okay, if you’re running
WordPress 2.1.0, you’re basically naked
on the Internet,” right? If you’re running a version
of WordPress, you know, 2.2.1, you’re still naked
on the Internet. And it’s not WordPress. It’s Joomla!,
it’s, uh, you know, all sorts of different–
Drupal– all these packages have had
holes at different times. So Google will try
to give you a warning, but we don’t guarantee you
that we’ll warn you. Yeah. [man speaking indistinctly] Yeah. [man speaking indistinctly] Yeah. Yeah. So we can take it offline
a little bit. But the basic idea is, if you notice
that you’ve been hacked, take your site down,
restore from backups, make sure that you’re
as secure as possible. And then you can do what’s known
as a reconsideration request, at least for Google. And that says, “Hey, we’re okay.
We’ve cleaned up the hack.” We also have
a Google Webmaster forum. And that’s linked to
from google.com/webmasters. So that’s a couple ways
you can let Google know that things are, uh–
are fixed. So this one’s
kind of an interesting site. This is HIV InSite at ucf–
ucsf.edu. Pretty cool site. Um, I noticed
a-a couple interesting things. Number one,
they’re really not using very much of this page. So if you want to see
what’s going on, you have to go
all the way down here to see there’s–there’s basics. There’s different sort
of things. The other thing– so I’ll go ahead and use
the rest of this space by making the font bigger. Um, check out this one. Recommendations
for Use of Antiretroviral Drugs in Pregnant HIV-1-Infected Women
for Maternal Health and Interventions to Reduce
Perinatal HIV-1 Transmission in the United States. What is that about? I have no idea. I’m not an HIV expert. Is this the very first thing
that you want your users to see? Maybe it is. Maybe this public-health site
really wants to highlight what are the important
medical papers coming out. But if you’re taking the time– And notice this says,
“Editors’ Picks.” So someone at UCSF has decided
that that’s an important paper. If you’re taking the time
to decide that’s an important paper, include, like, a paragraph
of beginner language. What does this mean? “Oh, if you are a new mother, it will help you
if you do the following steps. Click here to read more.”
Right? So what’s really interesting
is if you go down to the bottom, there’s a section
called “Basics.” And I’ll just tell you,
you know, it’s possible
that you can go to Google, and you can, uh–
if you go to Google and–and stand in the lobby, you can see queries
streaming by. And if you ever do that, what you’ll find
is that typically, the sort of queries
that show up are not HIV transmission in maternal
blah blah blah blah blah. Here’s what people type.
Make it really big. [hums] You’d be surprised
how many people type those queries, right? “Aah! I went to the doctor. The doctor used this really
long phrase. And I don’t know
if this is a serious disease or not, you know?” So people who come to Google
type in things like this. You’d be amazed. Something like 25%
of all the queries we get in a given month are unique. And it’s because
it goes straight down deep into the long tail. And they use beginner language. They are not using language
like, uh… You know,
here’s the Basics section. “What are HIV and AIDS?”
“Should I get tested for HIV?” How about this question. “I just tested positive —
now what?” “I just tested negative —
now what?” If you are designing this purely
for users, I would probably put this
section front and center, right? And move the research papers
a little bit further down, ’cause if I’m a user,
I have to click a lot further to find this. I just happened to see
the Basics section. This is the part that’s going
to be of a lot more interest to your average user. So, you know, when people
come to Google and they start typing this, this is actually
a pretty good match for this sort of thing. So there’s a ton
of good resources here, but you might think
about rejuggling a little bit so it’s
a little more crawlable, a little more,
um, understandable. So we call this, like,
the disconnect in jargon. Any of you guys
do industry sites with heavy– yeah, like,
technical language? Think about what the user
is going to type, right? That guy who was doing
iPod car integration, he’s completely ignoring
the 250,000 people last month who said, “iTrip.” He doesn’t even have a mention
of that product on his page. In the same way, this site,
while very good– in fact,
very crawlable, right?– sort of doesn’t necessarily
match with the terms that users are going to use. And you want to think
about the first impression you have whenever you land
on a page. Does that make sense?
Cool. All right.
Let’s look at another site. Ah.
What do you think? This is what my team sees
all day, every day, all day long. Uh, blogsense.biz. Uh, post number 1,276. So he’s done 1,275
previous blog posts just like this. “Radmaxx kicker fx,
phantom electric scooter, used electric scooter,
schwin, new.” So don’t do
this kind of stuff, right? If you ever get somebody
coming to you and saying, “Hey, uh, we’ve got an idea
to generate a lot of envelope pages,
hallway pages, doorway pages, shadow domains,
whatever you want to call them.” A lot of things that– “Oh, it’s okay.
There’s no spam on your domain. We put the spam
on other places, and then we sort of funnel them
in your direction.” Users hate this kind of stuff. Check out the previous post. “Aamaryland Child Support
Check myIowa,” right? It’s pretty much
complete gibberish. And I hope that programmers
in the audience are not thinking, “How could I
generate a page like this?” I’ll tell you right now, they use things
like Hidden Markov Models. They scrape search results. They try to find out
search results– search queries
that people have done. But the nice thing is this stuff
is pretty detectable. Uh, it’s relatively rare
to see pages like this on the Web these days. Or they exist on the Web,
but you’re less likely to find them
through search engines. Notice that even the spammer’s
getting spammed. He’s got a WordPress template,
and it says, “Download movies,
mortgage calculator gadgets,” and he’s got little links
down at the bottom. So a lot of this stuff
is just completely awful links. The reason why I wanted
to show you this one is a lot of the times,
people come to me– And I’m guessing almost everyone
in this audience is going to be, “Okay,
should I break my articles into multiple pages? You know, what if I have
a one-page thing where I can print
the entire article? Is that going to be
duplicate content?” It’s amazing to me
how many white hat people come, and they’re worried
about something that’s, like, you know– I can understand
why they’re worried, but they really don’t need
to worry nearly as much as these guys. The black hat guys
don’t skirt the line. They go–boosh–
right through that barrier and break through it
and go as fast as they can toward the keyword stuff
and sort of spam stuff. So, uh, it’s natural
that people worry about different things. But most of the times, you don’t need
to stress that much. Okay, let’s look
at another one– Comfortfeetshop.com. What do you like,
and what do you dislike? Get a little
audience participation. What do you think is good
about this page? There’s some pictures. Someone else was saying
something else? Brands–
actual brand names, yes. Mephisto.
I’ve heard of that. ECCO.
I’ve heard of that. Okay, what do you dislike? Brands. There’s–there’s a lot
of brands here. Now, and you said– Yes, yeah, this–
things need to be centered. You know, it’s kind of weird that it’s left-justified here
and centered here. That’s a little strange. The text is really weird.
Look at this. Oh, yeah, I would like to buy
some Cole Haan. I can’t buy it.
It won’t let me buy it. Please, give me
some Cole Haan, right? So number one,
that text looks a little– sub-optimal is a polite way
to say it. And number two,
it’s not even clickable. So if you do want to get
some Cole Haan, you can’t click on it
and get to it. It turns out the secret is… that the individual images
are clickable. So there’s a really good
principle of Web design, which is,
“Keep it simple, stupid,” right? And that is users are just
going to click on random things all over the place, everywhere. And so imagine–imagine
if you have an RSS icon, and you have RSS feed
as anchor text. Make both
of those clickable, right? In the same way you want
to make the image clickable and the text clickable. Don’t put it, you know,
a lot of the times, you can learn something
by saying, “You know what? Let’s get a regular user.” I swear, probably the best $20
you can spend is grab a regular person
off the street, put them in front
of your Web page, sit on your hands– you’re not allowed
to say anything– and watch them
try to buy, right? And so if they tried
to click on this– Of course, the Wi-Fi’s gone. If, uh–if they try
to click on a–on text and they can’t get to the brands
that they want to buy, they get really irritated
by that. While we’re waiting
for this page to load, what do you notice
about the URL? It is meaningless. There’s no way a user can remember
this particular URL. In fact, did you notice it has
an exclamation point in it? You don’t see URLs
with exclamation points in them very often. And, in fact,
if you looked at the home page, the, uh–see if I can hit
the back button without hitting reload–
the URL was this, right? So there’s clearly
a content-management system– a CMS–that is loading
the home page, which is homepage.template. I like to call this, “When you’re fighting
with your CMS.” You shouldn’t have to fight
with your CMS. Whenever you have, “Oh, yeah,
go to the product URL, it’s just gez!mep
yada yada yada,” users are not going
to remember that as much. Search engines are barely
going to be able to crawl that. So if you can go
to comfortfeetshop.com, see, even then,
it still redirects me to the deep URL. And so if you can,
it’s much better if you can stick
with short URLs, if they can be very memorable, if you don’t have to fight
with your CMS. And the thing is, it’s 2009. We shouldn’t be fighting
with CMSes at this point. If there’s a CMS
that isn’t built to be search engine friendly, consider going
to a different CMS, right? Be that voice that says,
“You know what? If the CMS hasn’t even thought about what search engines
might do, maybe you need to switch to something
a little bit different.” Does that make sense? Yeah, question. All right, so we’ll do this–
we’ll do one go around, which is
this power-user question, which is, suppose
I’ve got two links on a page. They both go
to the same place. Should I put a rel=”nofollow”
on one of those links? So first let’s back up and talk
about what nofollow is. Nofollow is a simple attribute
that you can put on links that says, “you know what? Don’t necessarily float
page rank. Don’t necessarily float
anchor text across this link.” And my short answer is no. In general,
whenever you’re linking around within your site,
don’t use nofollow. Just go ahead and link
to whatever stuff. And the site architecture
that you decide to use decides where the page rank
flows within your site. That’s why if you have
a treelike structure and you have
the most important stuff first, it’ll get the most page rank, because most people link
to the root of your page. So you remember
the HIV InSite one, right? They had
the medical articles up front, and then the basics
were down here, buried two or three
mouse clicks deep. That’s a good argument
to swap those two. You’ll get more page rank
for the Basics and the deep medical articles
that fewer people are interested in are further down. So a good way
to think about it– and it works for search engines
as well as for users– is how many
mouse clicks away am I? How many mouse clicks
does it take to buy your
iPod car integration? How many mouse clicks
does it take to buy your women’s sandals? So that is almost
kind of a good proxy of page rank, because page rank
is like a random surfer. Whenever we find a page, we extract
all the outgoing links. We take the out degree. We divide your page rank
by the out degree. And page rank
more or less flows in that proportion
to all those pages. So the closer you are
to the root page in terms of links, the more page rank
your page will have. So if you have to click
17 links to find, you know, a particular page, that means Googlebot
has to click 17 links to get to that particular page. So the amount of page rank
will be much, much less. That’s why you should think about putting
your most important stuff relatively close to the root. Now, I’m talking
in a virtual sense. Imagine if you click one link, and it gets you three
directories deep. Is that a problem? On Google, it’s not. We only look at,
“Oh, there was one link, and it went, you know,
three directories deep within the site.” That’s totally fine.
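A small sketch of the distinction being drawn here, assuming a hypothetical link map (a dict from each URL to the URLs it links to) and an example.com address invented for illustration. Click depth is counted by breadth-first search over links; directory depth is read off the URL path. The two numbers are independent of each other.

```python
from collections import deque
from urllib.parse import urlparse


def click_depths(links, root):
    """Breadth-first search: fewest link clicks from root to each reachable URL."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def directory_depth(url):
    """Count the directories in the URL path (the last segment is the page itself)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return max(len(segments) - 1, 0)


if __name__ == "__main__":
    home = "http://example.com/"
    product = "http://example.com/catalog/audi/a4/2006.html"
    # Hypothetical site: the home page links straight to a deeply nested URL.
    site_links = {home: [product]}

    print(click_depths(site_links, home)[product])  # link clicks from the root
    print(directory_depth(product))                 # directories in the path
```

In this made-up site the product page sits several directories down in the path but is a single click from the home page, which is the case described as fine for Google.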
It’s just one link. There are some search engines
who probably look at the number of depth
of the directory. So if you can make it
relatively close to the root on the number–
the number of subdirectories, that’s not
a bad practice either. Then you might do
a little bit better on Yahoo! You might do a little bit better
on Microsoft. But on Google, you don’t need
to worry about that. So the question, pulling back, is, should I put nofollow
on one of these links? No.
Let all the page rank flow. The only time
I’d really use nofollow on your own internal links where you say,
“No, don’t flow any page rank,” is maybe a login page, right? Think about Expedia. If I go to Expedia
and there’s a login page, what good is it if Googlebot
crawls that login page? We don’t have a credit card. Googlebot’s not going to fly
to Vegas for the weekend. So you don’t need us
to, you know, go and visit your login page. So that would be a good one
to put nofollow on. Don’t–Don’t bother to waste
the page rank on, you know, uh, that sort
of stuff happening. Mm-hmm.
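The "divide your page rank by the out degree" description from earlier in this answer can be sketched as a simplified, textbook-style iteration. This is not Google's production algorithm: the damping factor of 0.85 comes from the published PageRank formulation rather than from this talk, and the five-page site below is invented for illustration.

```python
def simple_pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: each page splits its rank evenly across its out-links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline (the "random surfer" jumping anywhere).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Divide this page's rank by its out degree and pass it along.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical site: "basics" is one click from home, the papers are two.
    site = {
        "home": ["basics", "papers"],
        "basics": ["home"],
        "papers": ["paper1", "paper2"],
        "paper1": ["home"],
        "paper2": ["home"],
    }
    ranks = simple_pagerank(site)
    for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
        print(f"{page}: {score:.3f}")
```

In this toy graph the pages one click from the root end up with more rank than the pages two clicks away, which is the argument for keeping important pages close to the root.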
Tell you what… maybe–maybe take it offline
and catch it after. Okay, now we reach the very
active participation part. I want everybody
to close your eyes. I still see open eyes.
No, seriously, close your eyes. Close your eyes. While I have your eyes closed,
I’m going to bring up a site. Keep your eyes closed. Okay, open your eyes.
Snap judgment. What’s this site about? What does this site do? No idea.
Closebys. “Closeby services
meet Close friends to share the small things.” What the heck does that mean? I have no idea.
Okay, well, let’s dig deeper. “3 cents to 10 cents/ACElet
upto $110/year.” What does that mean?
I have no idea. Okay. Um, “Everyone qualifies
for 3 cents.” Yes! I’ve made 3 cents.
What does that mean? I have no idea. Okay, so let’s start
digging deeper. Let’s start clicking around
within the site. “Hair Salons. Auto.
Maintenance. Home.” Now, if you’re looking
in the bottom left, every single one of them
leads to the same URL. They all lead
to closebys.com/services. Okay, so I’m getting
confused here. “Maintenance.
Health & Fitness. Travel.” Okay, let’s click on Travel. If someone can’t land
on your site and know what it’s about
in a second or two, you’ve lost probably half
your buyers or half of your conversions or half of whatever it is you’re
trying to optimize, right? So this is a good example
where you get that person, and you say,
“Okay, I will give you $20 to use my site.” And probably
the first five people to use that site are like,
“What–What is it?”, right? Oh, we sell ACElets.
What are ACElets? ACE is an announcement,
coupon, or event sent to you
by your local business. Okay, so now we start
to get to it. You register for ACElets, and then you can get these sort
of offers sent to you, and they can be coupons. So that can be pretty useful. Um… start to look
into it a little bit more. Services pay per customer. It’s a little bit
opaque, right? What does Closebysocial have to do with businesses
allocated, right? And so this is another one
where you want to sort of think about the fit and finish
of your site. I-I–We were doing
a practice session at Google. And I was saying,
“Okay, close your eyes. Open your eyes.
And what does this site do?” And somebody was, like,
“They sell guns!” “What?
Why do they sell guns?” “‘Cause there’s an NRA show.” “NRA. Maybe the–
Oh, that’s the restaurant. Okay.” They seriously thought, and I seriously thought
until I moused over it, that this was
about selling guns, right? So think about the overall fit
and finish. Um, and here’s
a simple example. “Upto” is two words, right? That’s like the fifth word
that you see whenever you start
reading the page. So think
about the little things. Think about
the fit and finish. Think about–There’s actually
a pixel missing here, which you might
or might not be able to see. And so how–how professional
do these things look? Um, as you click around, it’s actually
kind of an interesting idea. If someone wants to save money, they can sign up
and get these announcements, coupons, and events,
stuff like that. So it can be pretty handy. But you’re not doing
a very good job of communicating it to people. Now, I’ve–I’ve been
a little sarcastic. I’ve been a little mean. So what’s good about this site? Anybody want to chime in?
What do they think? You won’t–You won’t fall asleep
looking at it. That’s true. It looks a little
like a MySpace page. Got the glitter. You know, that would sort of
complete the look a little bit. Um, I think this
is kind of interesting too. This is a banner ad on Closebys
for Closeby services. And if you click on it,
you go back to Closebys, right? Fit and finish. It’s not all
about search engines. It’s also about your users.
It’s also about your visitors. Now, it’s not completely
the same URL. In fact, if you look at it, it’s actually
closebys/index.html. And that brings me to a topic
that I wanted to mention. You would be amazed
how many people have got
closebys.com/index.html, index.php, no index.html,
www.closebys, non-www closebys. So one of the big pieces
of advice that I give is look at the internal linking
on your site and be consistent. You would be amazed–
literally amazed– at how many sites have a www
and a non-www version of their Website. Now, why would that matter
to you? Greg, Brian, why would it be
a bad thing if I had two different versions
of my Web page? Grothaus: Well, one–
for one reason, if people are going
to your Website, uh, they’ll go to one page
or the other. And as a result,
they’ll–they’re sharing that Website
with their friends. They’ll send you one
or the other. And all of your links
on the Web will end up being split between those two different
versions of the site. Uh, Google’s pretty good
at noticing this. But sometimes when we see
that maybe these two pages look different,
maybe we’ll crawl one version and crawl another version
at a different time. And you have changed your site
dramatically in between them. And so we’re a little–We can
get a little bit confused. Are these the same page,
or are they not? And maybe we’ll split
your link juice between the two pages
and not give you full credit for all your links
to one single site, which is what
you really want. Cutts: Absolutely.
We normally do a good job. We normally can say
these are the same. But imagine if this
is a dynamic page, right? What if right up
at the top of the page, you say “the current date is”
with a time stamp? And then we visit
www and non-www, and the pages look
a little different. We might not correctly
glom those together. So I’ve actually seen
a www have a page rank 6 and a non-www of the page
have a page rank 4. So they’re splitting
their page rank in between those pages. Whenever you can,
it’s really nice if you can kind of unify them
and bring them together a little bit. Questions about that? Yeah. Oh, speak up.
Shout it out. White: Can you go
to the mic, perhaps? Cutts: In fact, come to the–
come to the microphone. Guy’s like, “I didn’t want
to be embarrassed like this. Come on.” man:
Um, does it help to have, like, a redirecting,
uh, application or something like that
for those type of domains? Like, doing 301 redirects
or something? Cutts:
Yeah, absolutely. Um, it helps to have 301s
internally and externally. So Google
is a very large Website. A lot of people come to Google. And so a lot of people
register domains that sound like Google
or variants of Google. For example, porngoogle, googlesex, googlingfordollars, right? All these people, they are like,
“Oh, I’m going to make money off the Google
domain name,” right? And so they register
something something Google. And then we have to go
and say, “Hey, you didn’t invent
the word Google. “Can we have googleporn
or googlesex or whatever back?” And they’re like, “Okay, fine.
Here you go.” Now we have this domain,
googlesex.com. What are we going to do
with that? We’re not going to have
an unsafe version of Google that’s, like, only porn, right? Matt’s house of porn. Do all your searches here
and get all the sort of weird, fetish stuff you want. We’re not going to do that. So what we do instead
is we do a 301 redirect. And that’s
a permanent redirect. And 301 just refers
to the HTTP status code. You can pretend to be a browser
using Telnet. So if you Telnet to port 80
and say, “GET /,” that’s saying,
“Give me the root of this page on your Web server.” And the–the response code
is 200, okay, if it’s a normal page. If it’s moved,
it’s usually 3-something. If it’s 301, it’s permanent.
If it’s 302, it’s temporary. So imagine that you’ve just
moved your Website for, you know, a few weeks. That’s the perfect time
to use a 302, if you’re going to be coming
back to it. Um, a lot of people
who do banner ads will do 302s,
things like that. But if you have moved
your site– if you have truly moved it
from sitea.com to siteb.com and you’re never planning
on going back, that’s when you can use a 301. And then all the page rank
flows, and everybody’s happy, and your rankings are preserved
and all that sort of stuff. And you can do an internal 301. That means if you go
to the non-www, you get a 301 to the www. So that’s really nice. You’re not linking to it. You’re just–voosh–immediately
taken in there. And a lot of sites
like Slashdot. If you go to www.slashdot, it’ll probably redirect you
to just slashdot. And that’s pretty smart. Then people
can’t really get mixed up. If you try to link to it
and you visit it, you’ll get redirected
to the non-www and then you end up usually
linking to the non-www. So that’s a great way
to take care of those
sort of duplicate issues. White: I’ll add to that too.
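The Telnet check described above can be sketched in a few lines: send a bare GET for the root page, then read the status code and the Location header from the reply. The function names are hypothetical, and `fetch_head` needs a network connection, so the parsing is kept separate and can be tried on a canned response.

```python
import socket


def build_request(host, path="/"):
    """The same text you would type into Telnet on port 80."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode("ascii")


def parse_response_head(raw):
    """Return (status_code, location_header_or_None) from raw response bytes."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    status = int(lines[0].split(" ", 2)[1])  # e.g. "HTTP/1.1 301 Moved Permanently"
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers.get("location")


def describe(status):
    """Rough meaning of a status code, following the classes mentioned in the talk."""
    if status == 301:
        return "permanent redirect"
    if status == 302:
        return "temporary redirect"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "other redirect"
    if 400 <= status < 500:
        return "client error"
    if 500 <= status < 600:
        return "server error"
    return "unknown"


def fetch_head(host, port=80, timeout=5.0):
    """Connect like Telnet would, send the request, and parse the response head."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host))
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_response_head(data)
```

Calling `fetch_head` on the non-www host of a site that redirects internally should report a 301 with the www address in the Location header, or the reverse if the site prefers the bare domain.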
Cutts: Yeah. White: A number of Web hosts
have already taken care of that for you. On my Web host, um, I found that when I typed
a non-www-version domain name, it automatically
redirected over or vice versa. So that’s sometimes
taken care of for you. I want to make a plug, too,
for the Live HTTP Headers extension in Firefox. Does anyone use that?
Raise your hand. Yeah. That’s a–
That’s a nice one. Briefly, it lets you see
the requests and the responses between the browser
and the server. In–In effect,
you are kind of– I don’t know
a better way to say it– but wiretapping
your own, uh, connection between the servers. So you can actually see,
um, what the response codes are that come through. You can do–
Use Matt’s version. Telnet. You can use
Live HTTP Headers and make sure
that 301 is happening from www to non-www
or vice versa. So, yeah, that’s something
you can add. Cutts: It’s a really
useful tool. You would be amazed
at how many sites return very strange stuff,
um, you know, in the headers. And so if you’re looking at
those headers, you’re realizing, “Oh, something strange
is going on.” And you can do it yourself. You can go down to the wire
and Telnet and all that stuff. But it’s much easier
if it’s just a Firefox add-on. So it’s a lot of fun. One more plug,
and I don’t want to– We are at a Google conference,
so I don’t want to be, you know, too salesy, but there’s
google.com/webmasters. That’s the Webmaster console. If you haven’t registered, it’s not negligence
if you don’t use it, but it’s a really, really,
really useful tool. For example, if you say,
“Okay, I own mattcutts.com,” we have an option
where you can say, “What’s my preference,
www.mattcutts.com or non-www?” We also show you errors
that we found when Googlebot crawled. We even show you
how long the latency is for Google to fetch pages
from your site. So if you’re having
server issues, you can sometimes
diagnose it. So we’ll show you all kinds
of really useful information. And the ability to set
that sort of www or non-www is very helpful
in that kind of case. Question. Yeah.
At the microphone, yeah. man: Um, yeah,
the, um, question before about variants in– Um, for example,
I use Analytics extensively. Cutts: Mm-hmm. man: And I’ll get a return
for the URL– the URL plus, uh, index.htm– um, several variants. And I don’t think
that’s a question– I think I’m consistent
within the Website. Cutts: Mm-hmm. man: But, um,
which kind of contradicts what you just said. Cutts: Oh, no, you always
want to be consistent if you can. man: Yeah, but I mean,
those external accesses, how did they, um, decide,
you know, to access my Website when
I was consistent internally, I guess, is the question. Cutts:
So most of the time, it’s just people
are not that skilled. Like, they’re going to link
to Slashdot, and so they just assume
it’s www.slashdot, even though Slashdot
wanted to be without the www. So a lot of the times,
if people– You would be amazed
at how many people will link to you
with broken stuff. In fact, you know, to plug
the Google Webmaster console, we will show you
all the backlinks to your site. And we’ll also show you
all the 404s on your site. So one of the best tips
that I always give is, if you would like free links– and most people on the Web
would like free links– you can go to this tool
and you can say, “Who are the people
who link to 404 pages?” And they’re going to be
very reputable people. There was, uh–
ZDNet had linked to my site, and they’d linked to a page
that didn’t exist with, like,
two trailing slashes. And so if you contact
that reporter and you say, “Hey, uh, you know,
double slashes on the Internet are weird,
just link to the normal page,” hey, free link. Now it’s not a 404 page.
It links right to me. The fact is, anybody can link
to anybody else on the Web. So we don’t penalize people
if a spammer links to you. The nice thing is,
if someone links to you that you normally link to
correctly, but they don’t always
link to you correctly, by being consistent
with your internal linking, you try to prevent at least the worst of stuff
from happening. Good question.
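The free-links tip can be sketched as a small filter: take the list of inbound links that hit 404 pages, and for each one guess the URL the linker probably meant, for instance by collapsing doubled slashes. The record format and function names are assumptions. The webmaster console's actual export format is not described in the talk.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def suggest_fix(broken_url):
    """Collapse runs of slashes in the path, e.g. a stray double trailing slash."""
    parts = urlsplit(broken_url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def outreach_list(backlinks):
    """backlinks: iterable of (linking_page, target_url, status) tuples (hypothetical format).

    Returns (linking_page, broken_target, suggested_target) for every 404 target.
    """
    return [
        (source, target, suggest_fix(target))
        for source, target, status in backlinks
        if status == 404
    ]
```

Each row of the result is someone worth contacting with the corrected address.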
Yeah. Another question. woman: Um, we, uh, work
at the University of California, and we’re not allowed
to use some Google tools because of an indemnity clause. And I’m wondering
if there’s an indemnity clause associated
with the Webmasters. Cutts: Uh, that’s
an excellent question. Catch me and get
my contact info afterwards, and I’ll–I’ll try
and look it up for you. Yeah, this is one thing
that, um– I talk to a lot of government
Webmasters, right? ‘Cause they’re like,
“Okay, we would like to find out “what the broken links are
on our site, but, you know, bureaucracy
can get a little overwhelming.” And so sometimes people
literally just go rogue. They’re like,
“I’m just going to prove “that I own this Website. “I’m going to edit
this meta tag. “No one needs to know. “Okay, now I’ll log out. And I have
the information I need.” And I’m not condoning that.
I’m not sanctioning that. I’m just saying a lot of people,
turns out, they do that. Uh, but everybody has different,
you know, restrictions on when they are
or aren’t allowed to do it. Um, I’ll have to check
on whether we have some sort of indemnity clause
on–on Webmaster central. Good question.
Another question. Yeah. man: Hi. Um, interesting
that you’re from Omaha. We just, uh–we sold our company
a couple of years ago, NetShops. Um, and because
of non-competes, we–we really couldn’t sell
much of anything online. So we started
that racy.com site. And I understand
why you didn’t put it up. But I-I’m very, very curious, are we starting from a hole
because we’re trying to– Uh, it’s not pornographic,
but it’s on the edge. And I’m just wondering
what, uh–what the take– is it–is there also history
of the domain name that is bringing
along problems? And how, uh–
how can we resolve– What–What should we do
in this situation? Cutts: Absolutely.
Very fun question. So we took a bunch
of submissions. Um, one of the submissions
was racy.com, which I have no problem with. I put it up, you know, in our internal
training session. We were looking at it. There’s some
pretty explicit stuff. In fact, last night
I was trying to find some part of the site that would be safe,
you know, to project. And I was like,
“This is a little–” It’s not radioactive. It’s just, like,
there will always be a couple objectionable words, so I decided to not put it up
on the projector. Uh, but as I recall,
you had good URL structure. So let me turn that
into a more general question. I just bought a domain, and I’m worried
the previous owner might’ve done something unsavory with it. It turns out, in your case,
that didn’t happen. But here’s what you can do, go to archive.org
and look at www.racy.com or example.com
or whatever your site is over history. In fact, if you’re thinking
about buying a domain name– and we have seen people
buy domain names where a spammer took it, and they burned it
to the ground, right? They just did everything.
They spammed– You remember the, uh… this–this–this guy– Where’s my spammer?
There’s my spammer. Blogsense.biz. A lot of the times,
people will buy this domain, spam the hell out of it,
do as much stuff as they can, and as soon
as all the search engines have caught them and, like,
deconstructed their spam house brick by brick
and buried the bricks in different places
and then salted the ground so that this domain
does not rank well, right? Then they try to sell
the domain, right? And a lot of the times, if your
traffic is going like this and then the search engines
find your spam and it goes like this, they’ll take a picture
when the traffic was like this. And they’ll be like, “Oh, yeah,
look at our rankings. They’re really climbing.” So the way to protect yourself
against that is a couple things. Go to archive.org and search, and you can see the historical
amount of information about the domain. You can see
what the domain looked like three years ago
or four years ago or five years ago. The other thing is,
just do a search for the domain. I think I can do a search
for racy.com, um… as a phrase, right? So racy.com
is right at number one. That’s a good sign. And you’ll notice
now you have dictionaries, that sort of stuff. If what you see is people
complaining about your site, people saying,
“Oh, I can’t believe racy.com. Those guys are spammers,”
or something like that, then I would be worried. But in fact, I don’t think
that you necessarily have any issue with the previous
owners having issues or something like that. Feel free to catch me
afterwards, and we can talk
about it more then. Okay, more questions
from the audience? Yeah, come on up to the mic. I’m figuring you guys prefer
a little more interactivity, but I’m pretty– man: I have two questions.
Cutts: Cool. man: Do you treat 500 responses
different from 502s? Cutts: Mm-hmm. man: The second one
is, um, images. Do you guy–
How–how would a– Let’s say we have
a thumbnail version, and we have a full version. And we prefer you index
the full version, but we’re only showing
the thumbnail on a particular article. You can see an example
of that on ehow.com. Cutts:
Yeah, absolutely. So 500 versus 502– if you want
to catch me afterwards– I think that we do
treat those separately, but I’m not 100% sure. Um, there’s a lot of different
HTTP status codes. So 300s typically stand
for redirects. 200 usually means okay. You know,
400 is some sort of error. And then, you know,
they mean different things. Um, typically whenever
we’re doing image searches– so let’s do an image search
for flowers– um, we will try to find
the best picture that we can. And if there’s a thumbnail,
we will do our best to say, “Okay, let’s find the one
that’s a little bit bigger,” because users are typically
a little more interested in the one that’s bigger. One thing you can do is try
to shunt a little more page rank to the larger image. That can help out quite a bit. Um, we can also talk
about it offline too. Did you–Okay.
Yeah. man: Hi. I had a question
about a good site. If we could just look at a site that follows some
of the best practices. And, uh–And I also had
another conference where they talked about video
being really important on a Website to boost, uh,
ranking and also indexing. Cutts:
Yeah, absolutely. So–And the funny thing is,
uh, what was it, Closebys–
I may have lost it. Um, Closebys actually had
a very high-production video, and they buried it,
like, six layers deep. So, uh, a lot of people
ask me, you know, “Hey, I want to get
number-one search rankings.” It turns out,
sometimes it’s easier to get number-one
search rankings by making a great video
or a great blog post or a great image
or something that’s newsworthy rather than just a pure
Web page. Um, video, you would be amazed
at the sort of– you know,
if you go to YouTube– the amount of traffic
that some videos can make. In fact, uh,
if I were to search for, uh, Google Webmaster
video channel– One thing that we’ve been
doing recently is not just recording– not just doing blog posts,
but doing videos. And they’re fantastic, because you can take
a Flip, right? And Flip video costs… 100 bucks,
a couple hundred bucks, right? And you can–I mean, it’s like
the size of a cell phone. You can carry it
with you everywhere. And so if you see something
interesting at a conference, you just whip that out, and
you can video it very easily. So there’s no reason why you have to put a ton
of work into a blog post. Sometimes you can do it
a lot easier with a video. Uh, I was doing something
where I was hacking a Nintendo Wiimote. And I was like, “Oh, this
would look good with a video.” So you just whip out the Flip,
upload to YouTube. You’re done in no time at all. So we’ve discovered
that doing– We’ve done something
like 50 different videos. And, um, what we do
is about every month or two, I take questions. So, you know, “How important
are brands and ranking?” “Does anchor text
carry through 301s?” I can answer that question
in a minute and a half instead of 20 minutes
or 40 minutes or an hour doing a blog post. And you can see, you know, 14,000 people have watched
that first video. So two minutes to make, and then you can get a ton
of views. Video is a very easy way,
where, if you are willing, if you have the ability– You have to worry
about indemnity. You have to think, “Do I have permission
to make these videos?” But if you do, you’d be amazed
at how viral they can go. Um, one tip that I learned– You notice
every single one of these has me in a red shirt, right? So I sat down, and I answered
30 questions all on one time. And by the end
of 30 days, everyb– I was doing one
of these videos a day. By the end, everybody was like,
“Okay, no more of the red shirt. “Please burn the red shirt.
“I’m so tired of this red shirt. Ah, I hate it.” So I did another one recently. And I brought in eight
different T-shirts. And I do about five questions, and then I’d switch
to a different colored shirt. So, you know, these things
are very fast to make. You can make 30 or 40 in a day. And it can be a way
to rank really well. In fact, if you were
to do a search for my name, um, you’d probably get my blog. You’d get, maybe, you know,
something on Wikipedia or whatever. But you–There’s something
like five videos ranking for my name. So it can be a very easy way
to rank. And there’s not
that many people that are thinking
about making videos, so it can be
a really big opportunity. Yeah, another question
at the mic. man: Uh, well, our marketing
people really like PDFs, and so they create them first. And then, um–and then
I’m one of the people who turn them into HTML. And I keep telling them to do– You know, “We need to do it
the other way.” Uh, could you speak
about PDFs versus HTML and what’s likely
to be most searchable and et cetera? Cutts:
Absolutely. So–and just before I go on,
here’s a couple videos. Qualities of a good site and the lightning round–
a bunch of questions. And that’s ranking
with, like, no links compared to–these guys
might have hundreds of links. So videos can be
a really big opportunity. Um, yes. PDFs. Another one I get
is Flash, right? Um, I love Adobe. Adobe has done fantastic things.
Photoshop is cool. The plug-in aspect of Photoshop
was revolutionary. They do amazing things.
Postscript, fantastic. Flash and PDF
are both very, very cool. But they’re not necessarily
native to the Web. And so people
can usually read PDFs, but not always. And you want it to be
as easy as possible for people to buy,
for people to learn, for people to do whatever it is you want them to do
on your site. So think about that
a little bit. I would totally agree
if you can make Web pages and then go to PDF, that can be
a lot more accessible. It’s a lot easier to link
to an HTML page. So HTML pages tend
to attract more page rank a little bit faster, ’cause people
have to link to PDF, and then they have to say
it’s a PDF link, all that sort of stuff. And just on Flash for a minute–
Flash is fantastic. But it’s better
to be decoration in the middle and not the entire site
in Flash. If you can make
static HTML links that Googlebot or users
can click, that is so much better, because we can find
a lot of individual pages, which gives you a lot
of individual opportunities to rank. If it’s just one big glob
of Flash, that’s harder to link to. Anybody able to think
of another reason why you might not want
your entire site to be in Flash? Right? I have a G1,
and I have an iPhone. The iPhone
does not handle Flash well. And who knows
when that will change. So there’s
a very great yogurt site– Pinkberry, right? And if you go to Pinkberry
on an iPhone, you’re stuck
with a little Flash cube. You’re like, “I want to know
where I can get yogurt. Give me the yogurt.” And you can’t find out. There’s another site,
you know, Red Mango, and they do a better job. So, you know,
think about Flash. Think about PDFs. If you can do it in a Web way, I would start with a Web page
and then go to PDF. Feel free to say,
“Yeah, Matt said so,” or something like that
if it carries any weight. Often it doesn’t. Question. Yeah. man: I had a question.
Can you explain how the pa– Google decides
to index a page or not? I’ve noticed the page number
in our site go up and down, even though the content
has not changed. Cutts:
Absolutely. Um, the short answer is Google
basically uses page rank to decide which pages to crawl
and how fast to crawl. So in the early days, you could almost think of it
as sorting by page rank. Stanford.edu, CNN,
“New York Times”– tons of people link
to these sites. And so we start crawling
with those sites. And then you find all the links
from “New York Times,” and you go outward like that. So it’s almost entirely–
at least in the beginning– a matter of how many links
there were and what the importance
of those links were. And by the way, page rank is not just the number
of links, right? Like, if ten people link
to Greg’s Web page and 15 people link
to my Web page, you might be like,
“Aha, I got 15 links.” But if his ten links
are “New York Times,” CNN, “Reader’s Digest,” and my 15 links
are my college buddies, he’s going to have
more page rank. So we’re going to crawl
his page first. And the same thing applies
to refresh policies. So we’re more likely
to visit stanford.edu, you know, once a day to figure out
that a page has changed than we are somebody
that nobody ever links to. That said, we try to do
a very good job of crawling as much of the Web as we can. Even if you don’t have any links
at all on your dot-com, we can often still discover
your site. You can always submit your URL
and things like that. But the more people
that do know about you, the more we’re often able
to crawl and the more often
we’re able to crawl you. So typically, we can refresh
our entire index on a weekly or, at worst case,
a monthly sort of basis. Another question.
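The "sorting by page rank" picture of crawling can be sketched as a priority queue: start from well-linked seed sites, always visit the highest-scored known URL next, and add newly discovered links as you go. This is only an illustration of the idea as described, not Google's crawler. The scores, the link table, and the function name are made up.

```python
import heapq


def crawl(seed_scores, links, limit):
    """Visit up to `limit` URLs, always taking the highest-scored known URL next.

    seed_scores: {url: score} for the starting pages (scores are hypothetical).
    links: {url: [(target_url, target_score), ...]} discovered when a page is visited.
    """
    frontier = [(-score, url) for url, score in seed_scores.items()]
    heapq.heapify(frontier)  # negate scores so the largest pops first
    seen = set(seed_scores)
    visited = []
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for target, score in links.get(url, []):
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-score, target))
    return visited
```

The same ordering could drive refresh frequency: pages near the front of the queue get revisited sooner than pages nobody links to.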
Yeah. man: I just had
a couple quick questions. Uh, one of them regarding
Google image search. Where do you guys– What–What text do you look for
to actually index the image on? Is it just the ALT text
and image tag, or maybe like title text, or do you actually take stuff
previous and after it? Cutts: The short answer is
we’re willing to look wherever we can find
useful data. So, I mean, in theory, you could look at the title
of the URL, definitely stuff surrounding
the page, absolutely the ALT text. Um, it’s very hard to understand
the actual content of images. Google just released
on Google Labs something that lets you sort
by image similarity. So you can start
with the Eiffel Tower, and it will show you other
pictures of the Eiffel Tower. We also have something where,
on Google image search, you can say, “Show me faces.” So you can do a search
like “Paris.” And then click
“show me faces,” and you’ll only get
Paris Hilton. So we do a little bit
of image understanding, but most of the time, it’s text around or within
the metadata of the image, that sort of stuff. The more descriptive stuff
you can have, instead of, like,
if you call your .jpg dsc129.jpg, that’s not as helpful
as flower.jpg. So think about all that stuff
and try to incorporate the data. man: And if you throw
an image tag in an H1 tag, will it get more authority
like the ALT tags do? Cutts:
This is a common question. H1, you know, is bigger
on the page than H2, which is bigger than H3. And so a lot of SEOs
are really, like, “Okay, everything’s
going to be in H1.” And it’s kind of nice
we have this page up, right? It’s the same attitude. I’m going to say the word
furniture 40,000 times. I’m really about furniture. If you type in furniture,
you have to return me, Google. And we don’t buy that. So if you put
your entire page in H1, uh, we’re not going to give it
the same weight that we’d give, like, two
or three words. So we try to be pretty,
you know, intelligent about that. I would just do whatever
is natural. You know, it’s natural
to have an H1, have a few sections,
have a few words, and have an H2. I wouldn’t try to use CSS
to make the whole page H1, because we do a pretty good job
of detecting that. Yeah.
Another question. man: Yeah. So I-I guess
what I’m hearing from you, that if you, say, have a page–
you have, like, an index.php, and you give it query parameters
like page=something, that’s worse than if you had
separate pages, you know, like, you know,
whatever, index.php and, you know, item,
you know, or, like, you know, iTrip or whatever .php,
stuff like that. Cutts: Yeah. So that’s
a fantastic question. Imagine this page
is index.php and val1 or, you know, item1=val1 and item2=val2 and you go on and on and on
like that. It used to be that Google
would only crawl URLs if there were, like, one
or two parameters in the URL. We continue to get better. So you can have 15 parameters
in the URL, and we can still crawl that. Like, if you looked at some
of the weird URL structure we had with exclamation points
in it, we’ll still crawl that. But in general,
for most search engines, if you can make something
that’s relatively static, instead of having
15 parameters, it’s easier to crawl, because if you think about it,
you know, if you have 15 parameters, usually at least a few
of those parameters do nothing. So if you take
that parameter out, you get the same URL. And now you’ve got weird
duplicate content issues where the same content is
presented on two different URLs. So as much as you can reduce
those extra parameters if you don’t need them, and, you know, if I were
looking back at this guy, rather than having
all these–whoops– rather than having
all these weird terms, again, if it could be
/audi/s4/2006, that’s so much more
understandable for a user. So I think
that can really be useful. It can–
you know, it can help out. We do give
a little bit of weight for keywords in the URL. It’s not like
that’s the secret. You put a keyword in the URL
and now I rank number one. But every little bit helps. So if you can have a few things
that make sense for users, where they can see the URL and get rid of these sort
of multiple parameters, it does help a little bit. And I think we’re pretty close
to the end of the session. How about
one last question. man: We put a database online
that was unique information. We just did it. And I know the guidelines
about length and number of links, but are those
just guidelines? Can we break that
in a situation that requires it? Cutts:
Very good question. So if you look
at Google’s guidelines, there’s one thing that says,
“Avoid having over 100 links.” And that’s not
in our spam guidelines. We’re not going to penalize you
if you have over 100 links. That’s in a different section. The reason
why we said that originally is because we used
to only crawl 101 kilobytes. And so we didn’t want people
having a four-megabyte file. Remember, when you have
a one-megabyte image download, that can be really
kind of a pain. So originally,
we recommended 100 links just so that people wouldn’t
have their pages be too big. Now we crawl a lot more– We’ll save a lot more
than 101 kilobytes of your page. So that guideline
is getting a little stale. You know, having up to hundreds
of links on your page is okay. What you really don’t want
is a ton of really spammy links. You know, the sort of thing like, uh,
the blogsearch.biz guy, like this. If you had 100 links
like this, that would get really,
really annoying. But if you
have just normal links, that I wouldn’t necessarily
worry that much about. Okay, so thanks very much. If there’s people
who have questions, we’ll stick around
for a little while afterwards. Thanks. [applause]
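
The URL advice near the end of the session (a readable path such as /audi/s4/2006 instead of a string of parameters) can be illustrated with a minimal sketch. The parameter names `make`, `model`, and `year`, the `sid` session parameter, and the example.com address are all assumptions made for illustration; the talk does not say which parameters the reviewed site actually uses.

```python
from urllib.parse import urlsplit, parse_qs, quote


def clean_path(url, keys=("make", "model", "year")):
    """Build a readable path such as /audi/s4/2006 from query parameters.

    The key names are hypothetical; any parameters not listed in `keys`
    (for example a session id) are simply dropped.
    """
    query = parse_qs(urlsplit(url).query)
    parts = []
    for key in keys:
        values = query.get(key)
        if not values:
            raise ValueError(f"missing parameter: {key}")
        # Lowercase, hyphenate spaces, and percent-encode anything unsafe.
        slug = values[0].strip().lower().replace(" ", "-")
        parts.append(quote(slug, safe=""))
    return "/" + "/".join(parts)


if __name__ == "__main__":
    # Hypothetical parameter-heavy URL of the kind discussed in the session.
    print(clean_path("http://example.com/kits?make=Audi&model=S4&year=2006&sid=93af"))
```

A site that adopts paths like this would normally also redirect the old parameterized URLs to the new ones, so that users and crawlers end up on a single address per page.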

44 thoughts on “Google I/O 2009 – Site Review by the Experts”

  1. Brian Mack

    Awsome, a well made video from google about building a site for SEO, I don't have to worry its not correct. The speaker did a great Job! Im building my first website, perfect timing for the release of the video.

  2. Mamsaac

    This video is ridiculously useful. Thank you very much for submitting this =)

  3. Calvin Webster

    great video. matt is such an excellent presenter.

  4. Joe Oviedo

    Great talk! I have now increased my SEO Powers! =)

  5. Shawn Cheng

    This is a great video Matt! Really helpful for a SEM guy. Great to see what happens after I actually get the users to the site. = ]

  6. marcuzzer

    great video! am i the only one who thinks the volume is a little bit too low?

  7. James Miller

    HOLY CRAP!!!!!!!!!
    who the hell would want to watch one hour of this?
    like wtf

  8. Nick Mecca

    Ok, why am I seeing so many comments saying "wow 1 hour!? why would anyone watch all this shit!?"

    This is for Google DEVELOPERS. That's why it's so long and *seemingly* dry. It's not meant for people who just SURF YOUTUBE TO WASTE TIME, it's for people actually DEVELOPING websites for Google. It's not intended to be ENTERTAINING for the GENERAL PUBLIC.

  9. Chairs

    you fucking morons, if anyone's into usability engineering at all a 1 hour lecture is nothing.

  10. Curtis Penner

    Why do they get to have a 1 hour video? Because they own the place…duh!

  11. terminator689

    The audience is like dead or something!
    They are not interacting at all with the guy!

  12. jake262144

    Your comment is too immature/dumb so guess what – I don't give a cr*p about it either ^^
    Seriously, dude why take time to create such poor comment? Too freaking long? Don't watch it.

  13. Lighter In The Storm

    you can actually purchase more time, but hey. Google is multibillionier company.

  14. Lighter In The Storm

    well google resurrects it in the end of the video. 😀

  15. Jimmy Browder

    I learned enough to have had to pay for this hour of tutorial help. Best wishes to you guys! Jimmy Browder

  16. netandful

    i told my webmaster to give me the conclussions, he was right!

  17. DiamondLight

    I enjoyed this video alot, please come to RIT sometime, I want my website reviewed

  18. markistc

    who the fuck is saying great video? do you realize this is an hour long? who makes an hour long video what is this Pixar!

  19. Chairs

    @Xrockstar147X
    ill give it to you in a nutshell:
    you dont need to be an expert to design a website
    just have common sense
    trust google is doing its job properly and talk as if you're talking to a human
    things should work how the user expects them to
    [ps i'm currently finishing a dissertation on usability engineering. final year computer science student]

  20. Jesse Smith

    do people actually use internet on this?

  21. Manoj Narayan Mishra

    Great Vedio!!!!!!!!!!!!!!!!!!!!!

  22. poop pants

    Your pain is the breaking of the shell that encloses your understnding. it is the bitter potion by which the physician within you heals your sick self,. therefore, trust the physician, and drink his remedy in silence and tranquility

    -kahil gibran

  23. Siva Ganesh

    nice.. although they review only few site. 2010 I/O has pretty much good no. of sites. 🙂

  24. KingDavidTV

    A fantastic education for all webmasters and bloggers. Commonsense but its good to know how the people at Google search think and what their bots can and can't do.

  25. erik ponty

    Thanks for submitting this video, going to tell my dad about this so he can improve his companys webpage! Thanks again 🙂

  26. trident3b

    @marcuzzer it's low. I have mine flat out to hear it normally… sadly.

  27. Jashan Singh

    @zwptsco I'm making $900 and day and still going strong. Finding the right niche is key… without that you've got nothing. If you wanna know more the Cash Lab has a sweet ebook and its free, It wont always be free so sign up now: bit.ly/LtS1Ao?=bcxuyo

  28. Andy Gold

    Site developers need to watch this video and take notes! Even a quick review once in a while helps! Make a note of this video and other Google webmaster videos in your own playlist. I will put this video up at my Websites-Make-Money site and make some notes about it.

  29. Irma Komala

    Anyone tried the MoboRank (search on google)? I've heard some extraordinary things about it and my buddy improve his website's google ranking by using it.
