How does Google Search work?

August 16, 2019

MATT CUTTS: Hi, everybody. We got a really interesting and
very expansive question from RobertvH in Munich. RobertvH wants to know: Hi Matt, could you please
explain how Google’s ranking and website evaluation process
works starting with the crawling and analysis of a site,
crawling time lines, frequencies, priorities,
indexing and filtering processes within the databases,
et cetera? OK. So that’s basically
just like, tell me everything about Google. Right? That’s a really expansive
question. It covers a lot of
different ground. And in fact, I have given
orientation lectures to engineers when they come in. And I can talk for an hour
about all those different topics, and even talk for an
hour about a very small subset of those topics. So let me talk for a while and
see how much of a feel I can give you for how the Google
infrastructure works, how it all fits together, how our
crawling and indexing and serving pipeline works. Let’s dive right in. So there’s three things that you
really want to do well if you want to be the world’s
best search engine. You want to crawl the web
comprehensively and deeply. You want to index those pages. And then you want to rank or
serve those pages and return the most relevant ones first. Crawling is actually
more difficult than you might think. When Google started,
when I joined back in 2000, we didn’t manage to crawl
the web for something like three or four months. And we had to have a war room. But a good way to think about
the mental model is we basically take PageRank as
the primary determinant. And the more PageRank you
have (that is, the more people who link to you and the
more reputable those people are), the more likely it is
we’re going to discover your page relatively early
in the crawl. In fact, you could imagine
crawling in strict PageRank order, and you’d get the CNNs of
the world and The New York Times of the world and really
very high PageRank sites. And if you think about how
things used to be, we used to crawl for 30 days. So we’d crawl for
several weeks. And then we would index
for about a week. And then we would push
that data out. And that would take
about a week. And so that was what the
Google dance was. Sometimes you’d hit one data
center that had old data. And sometimes you’d hit a data
center that had new data. Now there’s various
interesting tricks that you can do. For example, after you’ve
crawled for 30 days, you can imagine recrawling the high PageRank
guys so you can see if there’s anything new or
important that’s hit on the CNN home page. But for the most part, this
is not fantastic. Right? Because if you’re trying to
crawl the web and it takes you 30 days, you’re going
to be out-of-date. So eventually, in 2003, I
believe, we switched as part of an update called Update Fritz
to crawling a fairly significant chunk
of the web every day. And so if you imagine breaking
the web into a certain number of segments, you could imagine
crawling that part of the web and refreshing it every night. And so at any given point, your
main base index would only be so out of date. Because then you’d loop back
around and you’d refresh that. And that works very,
very well. Instead of waiting for
everything to finish, you’re incrementally updating
your index. And we’ve gotten even
better over time. So at this point, we can
get very, very fresh. Any time we see updates,
we can usually find them very quickly. And in the old days, you would
have not just a main or a base index, but you could have what
were called supplemental results, or the supplemental
index. And that was something that we
wouldn’t crawl and refresh quite as often. But it was a lot
more documents. And so you could almost imagine
having really fresh content, a layer of our main
index, and then more documents that are not refreshed quite
as often, but there’s a lot more of them. So that’s just a little bit
about the crawl and how to crawl comprehensively. What you do then is you
pass things around. And you basically say, OK, I
have crawled a large fraction of the web. And within that web you have,
for example, one document. And indexing is basically taking
things in word order. Well, let’s just work
through an example. Suppose you say Katy Perry. In a document, Katy Perry
appears right next to each other. But what you want in an index
is which documents does the word Katy appear in, and which
documents does the word Perry appear in? So you might say Katy appears in
documents 1, and 2, and 89, and 555, and 789. And Perry might appear in
documents number 2, and 8, and 73, and 555, and 1,000. And so the whole process of
doing the index is reversing, so that instead of having the
documents in word order, you have the words, and they have
it in document order. So it’s, OK, these are all
the documents that a word appears in. Now when someone comes to Google
and they type in Katy Perry, you want to say, OK,
what documents might match Katy Perry? Well, document one has Katy,
but it doesn’t have Perry. So it’s out. Document number two has both
Katy and Perry, so that’s a possibility. Document eight has Perry
but not Katy. 89 and 73 are out because they
don’t have the right combination of words. 555 has both Katy and Perry. And then the last two, 789 and 1,000,
are also out. And so when someone comes to
Google and they type in Chicken Little, Britney Spears,
Matt Cutts, Katy Perry, whatever it is, we find
the documents that we believe have those words, either on
the page or maybe in back links, in anchor text pointing
to that document. Once you’ve done what’s called
document selection, you try to figure out, how should
you rank those? And that’s really tricky. We use PageRank as well as over
200 other factors in our rankings to try to say, OK,
maybe this document is really authoritative. It has a lot of reputation
because it has a lot of PageRank. But it only has the
word Perry once. And it just happens to have the
word Katy somewhere else on the page. Whereas here is a document that
has the words Katy and Perry right next to each other,
so there’s proximity. And it’s got a lot
of reputation. It’s got a lot of links
pointing to it. So we try to balance that off. You want to find reputable
documents that are also about what the user typed in. And that’s kind of the secret
sauce, trying to figure out a way to combine those 200
different ranking signals in order to find the most
relevant document. So at any given time, hundreds
of millions of times a day, someone comes to Google. We try to find the closest
data center to them. They type in something
like Katy Perry. We send that query out to
hundreds of different machines all at once, which look through
their little tiny fraction of the web that
we’ve indexed. And we find, OK, these are
the documents that we think best match. All those machines return
their matches. And we say, OK, what’s the
creme de la creme? What’s the needle
in the haystack? What’s the best page that
matches this query across our entire index? And then we take that page and
we try to show it with a useful snippet. So you show the key words in the
context of the document. And you get it all back in
under half a second. So that’s probably about as long
as we can go on without straining YouTube. But that just gives you a little
bit of a feel about how the crawling system works, how
we index documents, how things get returned in under half a
second through that massive parallelization. I hope that helps. And if you want to know more,
there’s a whole bunch of articles and academic papers
about Google, and PageRank, and how Google works. But you can also apply to– there’s [email protected], I
think, or, if you’re interested in learning
a lot more about how search engines work. OK. Thanks very much.
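
The Katy Perry walkthrough in the talk can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not Google's implementation: the document texts are invented so that the resulting postings match the document numbers mentioned in the talk, and `build_inverted_index` and `select_documents` are hypothetical names chosen for this sketch.

```python
from collections import defaultdict

# Hypothetical toy corpus. The texts are made up; only the document
# numbers and which words they contain follow the talk's example.
DOCS = {
    1: "katy hudson early recordings",
    2: "katy perry announces tour",
    8: "perry mason episode guide",
    73: "matthew perry interview",
    89: "katy texas travel guide",
    555: "new album from katy perry",
    789: "katy freeway traffic report",
    1000: "commodore perry expedition history",
}


def build_inverted_index(docs):
    """Reverse 'document -> words' into 'word -> sorted document ids'."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def select_documents(index, query):
    """Document selection: keep only documents containing every query word."""
    postings = [set(index.get(word, ())) for word in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


INDEX = build_inverted_index(DOCS)
print(INDEX["katy"])                          # [1, 2, 89, 555, 789]
print(INDEX["perry"])                         # [2, 8, 73, 555, 1000]
print(select_documents(INDEX, "Katy Perry"))  # [2, 555]
```

Selection only narrows the candidates to documents 2 and 555. Ordering them (reputation, proximity of the words, and the other signals mentioned in the talk) is a separate ranking step that this sketch does not attempt.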

100 thoughts on “How does Google Search work?”

  1. Md Siddique Post author

    Don’t be confuse if you want to make your site to be the top rank on  search engine. This site will guide you to build your site to be the top one. Search "speed rank seo" on Google.

  2. Area Zak Post author

    There are many tutorial about how to make your zero rank site to be the top rank on search engine.  This site is clear enough to be your guide. But for more information, please search "speed rank seo" on Google.

  3. Vishnu Viswanath Post author

    appox how much is the size of index at google now? I have an app where i have indexed about 2 billion documents ~= 8TB of data and the index size is around 60gb. 

  4. Aditya sharma Post author

    I don't think google now rank pages have kati perry if they are writing side by side..for example let say user searched for "google search works" if you just search this in google you probably see pages on top position in which word "google" is in first line then is considered as best "search" engine in the world and it really "works" you see words of key phrase "google search works" in not in sequence and I don't see any site have those three words in sequence , appearing in top positions and I think this is the end of doing seo…Matt Google wants webmaster to leave thing completely on google's I right..its in 2015…

    Guys if you don't believe me just search for long tail keywords you will see this magic automatically.

  5. SalvadorDali22 Post author

    LOL did you see that jerk? Dropped his sponge lolooololol

  6. Bhavesh Gudhka Post author

    Very nice and useful video. Open Google and search for Optron SEO Training and you will find useful information. 

  7. BRBallin1 Post author

    The real question is how does it do all this so quickly and in under a second?

  8. akshat sinha Post author

    Thanku so much..
    I search this many times but none of any site or video can able to describe well…
    this is best fr those who are appearing fr there sem exam after 12 hrs… :/

  9. Chad Kimball Post author

    A lot of this is changing with database driven seach and conversational search

  10. Indian Movie Tapes Post author


  11. Marketing Digest Post author

    This is the best explanation for this topic that I've watched so far. Thanks for this!

  12. Ben Meuan Post author

    how long does it take to get adsense back if your account has been deactivated from improper activity ?

  13. next done Post author

    Hello Google I'm Shelby and I always get hacked my phone is system settings. Dish network Google web and phone and you tube and I get tired from search on just a LG sunrise phone by straight talk so bear with me I didn't hack or steal I'm the victim of identity theft and the thief took my credit social security number and committed huge internet fraud put me a hundred grand in debt thanks for reading from Shelby Rae Anderson Jones 13604668323

  14. enko benko Post author

    If you are searching to obtain higher rank on internet search engine, i suggest using "googleseotips weebly com". Just google it on internet search engine.

  15. HanneLore Marginean Post author

    Are the indexes and the page ranks mirrored between the datacenters around the world or is the search result different depending on your location?

  16. Isabelle Saritch Post author

    Yeap, nice video. I like it : ) When I check pages, i like to use this Google extension

  17. Serah Vurobaravu Post author

    This video has given me solutions to my queries.

  18. Bizentin Post author

    So is this page rank thing the reason when i type in a specific question into google I get results to "popular" websites instead of absolute relevance to the specific question i asked? Google is really going downhill. Do not rank my pages. That's my job

  19. Ruben Spyckerelle Post author

    Thanks you so much for this Really back to basic´s explenation

  20. C&M Digital Media Post author

    Hello Matt. Thanks for the explanation. Really helpfull.

  21. Bartosz Olszewski Post author

    google basically makes copy of the internet. cool.

  22. Captain Nemo Post author

    Thanks for the explaination,
    really helpful

  23. Shivam Sharma Post author

    hey Matt, please can you Explain the Whole HTML code that is behind the google page

  24. MissMiserable Marny Post author

    google search are tottaly irrelevant bullshit!rather go to deepnet to find somphting exept ebay…the pages are hidden deleted ore in the 33 page…the picture search is sad,bec its the biggest search engine and dryng to give u the less amount of info possible….u cant ewen find info about why it sucks so bad bec u cant find it!!!

  25. Trending Line Post author

    Hi matt.can you tell the requirement for work in google

  26. M P Post author

    Many good new companies have no chance to meet your requirements to emerge a market. Tough to be the first page because SEO companies and freelances have know your strategy and feed all the information with different URL and contexts in terms of web board, blogs, classified, social networking, and so on.This is still refer and point to the same firm that hired SEO companies. So, you collect all data they feed and, show the same listing result with different URL. Unavoidable, people have to read and click the same topic on your listing result. It is like a circle. What if a bad company mislead people to use and buy his product by using SEO firms and you show the same listing result? What if…? Please revise your method.

  27. TECHSLO CHANNEL Post author

    I think that you somehow managed to simplify the crawling process of Google's search index. I had no idea it had come so far since the last time I sat in one of these lectures several years ago.

  28. gyro22 Post author

    Google is a big advertiser. They direct the flow to adds that pay the most.

  29. Prodix Studio Post author

    hey! i'm new in the world of internet and have search many times on google how google get websites data of old and new created websites but i don't found better results for this .
    and i want to know about it i want to know that How google get data in his database automatically or you hire people to put the websites data into your database?
    if automatically then how please answer me please i am waiting.

  30. Niki P. Post author

    Hi, I want to know if you can help me with a seminar homework that I have to do, and I can't think of anything on the topic "Can Google find all the information you are looking for?". Please I really need your help !!! 🙁 :((

  31. Andika Al-khalifah Post author

    This vidio is usefull for me, and other vidio's.
    I hope every vidio have translate in indonesia, thank for google webmaster.

  32. Tonny Milfiger Post author

    they say they download the whole web data now who gave the access to all webdata? on which platform, what was the coding prototype to so. i mean lets say my website have a host and my website is on that host and my website have a domain , if you type my domain you can access to all my information so how its work they scan al domains? how did they found the data?

  33. HARI PRASATH Post author


  34. Stephanie Ballard Post author

    I`ve got something you must know fellas. I’ve got a new business utilizing “zimo unique plan” (Google it) which was shown to me by my good friend where he earned lots of money. This system has given me outstanding results. In case you still have no idea concerning this, you should be informed at the moment…

  35. Wael Chorfan Post author

    I really had that feeling before watching this trick explained

  36. Wanda Brown Post author

    Ì have a list on Search You Tube and I don't know how to remove the list that I put on. It please help me.

  37. Erappa Bavigadda Post author

    its too boring and tedious so can u use simple words becos I am beginner

  38. ARPATech PR Post author

    Very helpful information regarding Google ranking and website evaluation process, you have amazingly described the content. Great video!

  39. True Tech Post author

    HI Dear
    Matt, could you please explain letent sementic indexing

  40. Zes Post author

    wrg, no such thing as can or even can talx xxx, talk/can talk any by any n omatter what and it can all b perfx. ts not interes or uninteres. no such thing as strain youtube or not, can talk anyx nmw

  41. Barath Kumar Post author

    Hi Matt, could you please explain how Google's alerts system works? is it the same logic as the Google Search algorithm or is it some other logic. And on what basis a person is getting notified is it the page ranking algorithm or some other triggering technique?. This is an amazing video, Thanks for sharing it.

  42. UFO SecurITy Cam Post author

    i managed to rank our ufo forum number 1 within 3 months in 2018 using simple methods. contact me about 99/month seo service guaranteed ranking or your money back

  43. DS Post author

    @Google_Webmasters Hello Mr Matt Cutts what can you say about a no content articles which have already first rank on some good keywords .?.??
    Please can you search for these keywords "cours management et leadership" … please can you check the first result ? i wanna believe what are you saying on this video but .. such results i find everyday do not say same things as you showing us here. So i please you to make video about such results .. you can check there is more than 80 links on that website which reach first ranks in Google . Thanks for everything Mr Cutts <3

  44. jakir rain Post author

    but for the searching the billion of pattern it takes more second, the result is display out in a millisecond how it possible.

  45. Prateek Vyas Post author

    do google use OCR technology in images to search for something

  46. wildpolo Post author

    but its not enough to understand google 🙂

  47. E E Post author

    Hey, nicely done. No trouble holding my attention, informative, interesting, motivating. Answers big questions about the new way we live and learn. Maybe future shock won't be so bad.

  48. Numan Inam Post author

    Is auto blogging wordpress through RSS effects SEO badly
    Please answer

  49. Ricardo Santos Post author

    Reputable or Popular. Not the same thing.

    Popular is easy to quantify, but how do you quantify reputable?

    For example, the CNN of the world and the New York Times of the world, both have being caught pushing false narratives on daily basis. How exactly are they reputable?

    Furthermore a source "reputable" on one area may not be "reputable" on other areas.

    I wouldn't trust a "Scientific American" to give me advice on law. I would trust Cornell. But I wouldn't trust Cornell to give me advice or science.

    Reputable is a subjective term. As it involves who you personally trust.

    Thus what is reputable for you might not be reputable for me. So my guess is whatever your company ideology tells is "reputable" is what your company put as "reputable" in the algorithm.

    Which would explain your company "American Inventors" search results and "Idiot" search results. Your sources are biased on what you determined what is "reputable" and when your "reputable" sources lack integrity, your company algorithm suffers from it.

  50. Anitha Ajaien Post author

    What should I have to do to show my website in Google search results…

  51. corporate headquarters Post author

    Hi Matt could you please explain how google manage the google maps business listing I have my business listed very low and if I click on the ranked filter my business jumps to the top.

  52. Mayukh Mukhopadhyay Post author

    now i understand what IDF (inverse document frequency) is all about…

  53. Taghreed Aboutaleb Post author

    your videos are really great thanks a lot for your time.

  54. BTraine Ztraine Post author

    At what point in a process do you scrub all the negative content on Hillary email scandal?

  55. Micah Post author

    How does Google Search work … it doesn't. It is obvious that Google ranks the sensational, dramatic, popular, and advertisers that give them $$$ ahead of what is actually relevant. Type in [any large city] weather and you will often get non-weather related dramatic news articles featuring who was shot, went missing, or was killed in a car crash ON THE FIRST PAGE. That has no business being on the 1st page since it is not relevant and has nothing to do with weather. When I was looking for health insurance, I typed in Health Insurance Marketplace and the actual government website came up FIFTH.

    0:59 is laughably wrong, because Duck Duck Go has Google beat.

  56. Nan Deng Post author

    I wish to learn more about index & push data out phase near 2:00

  57. Versatile Knowledge Post author

    This is like, Hey Google !! Tell me something about Yourself 🤣

  58. mariocarnival Post author

    Almost seven years later at this time what have changed?

  59. Nancy R Peters Post author

    I don't know why my print is not working. It was before. I don't know what happen. Also I use to have Google Search. I have a Chrome PC and a HP ENVY 4500 All in one Printer.

  60. Rahman Sadeghi Post author

    but in many best site like searchenginejurnal wrote google don't use LSI skill!!!
    what's your idea about it?

  61. Tinker Treasures Post author

    Wow amazing videos! How am I just now finding this?? Finding out about meta tags saved me HOURS. Thank you so much for all the information.

  62. Christopher Wu Post author

    OK, I guess I will get into Google then learn more about the search engine.

  63. Holyvessel34 Post author

    Maybe I need to figure out how a record player work first before I try to understand a Google lol. This is too much for me lol

  64. gouri panda Post author

    Tell me everything about google?😊😊😊😊😊

  65. Rohit Maindoliya Post author

    Could anyone tell me how much links google crawl for one query

  66. Charlie Gardner Post author

    Just a tid bit for yall: If you live with depressed/suicidal people, then this algorithm will severely negatively impact the mental health of anyone who uses the same ssid (ip). Depressed people get a depressed internet with nothing but depressing recommendations to make you even more depressed. But google says it's better, so I can't argue with that.

  67. Prince Malik Post author

    Amazing! Got my work done! Extremely workable video, if one pays attention.

