Hello, I’m Danilo Petrozzi, and today I’ll explain why SEO professionals use
Scrapebox, and not for spam, as you might think.
When you hear about Scrapebox, the first word that comes to mind is definitely
spam. That is because in the past this software was used mainly for that purpose:
one of its many features lets you mass-post comments on a potentially unlimited
number of sites and so obtain follow or nofollow backlinks through CMSs like
WordPress, Joomla, Drupal and so on.
That said, in this video I will try to explain what Scrapebox is, what it does, and
which slightly more advanced uses most people don’t know about.
I, for example, absolutely did not buy it for comment spam, which is something
I have never needed because I know it does not work, so it would simply be
useless. This is the Scrapebox site. Scrapebox is nothing more than a piece of
software that you buy, download and install. The license is per computer: it is
not tied to an email address or to a person, but to the individual machine. So to
use Scrapebox on two different computers, even in the same house, you need two licenses.
The license costs $97, but in the description I have put a link to a somewhat
hidden page where you can buy it for $57.
The payment is one-off: you pay $57 once and the software is unlocked for life.
And the best part is that it is updated very often. I frequently find updates to
the software and to the addons that can be installed, which we will see later.
But meanwhile, let’s look at the site. The site is, well, a classic long sales page
that inspires little confidence, it must be said; however, I have tried the
software, so I can already tell you that it is good. Let’s see what the site says,
in short, about what Scrapebox can do. It almost never mentions the ability to do
comment spam, so you can already tell that this was not the intent of the software’s creator.
Here it tells us there is a trainable harvester, one that can be instructed to
perform operations, and proxy management, which is one of the features I appreciate
most. List management, table mode, the auto-commenter, obviously for WordPress
and the various engines and CMSs that exist.
The proxy verifier, which checks proxies, their level of anonymity, and whether
they work on Google SERPs; we’ll see how later.
RSS submission, trackbacks, reviews and then the payment section.
Before I even begin to explain everything Scrapebox can do, a clarification
about proxies. You could very well use Scrapebox with your home IP, with no
protection and no filter in between.
This is not recommended, not so much for legal reasons, because in fact most of
what you do with Scrapebox is legal and legitimate and violates no laws.
The problem is that if you always use your own IP for things like harvesting on
Google, that is, collecting many URLs for various keywords, at some point Google
could ban your IP. It simply says: you are spamming, you are scraping my SERPs,
so I ban your IP and you will no longer see search results.
There are lots of Scrapebox features that do not involve Google, but even for
those, using your home or office IP is simply not advisable, because if you did
get banned you would have to unplug the router and plug it in again to change IP,
assuming you have a dynamic one. So it is better to always use proxies.
Scrapebox has a built-in function for finding new free public proxies. We will
see it later, and these proxies can have their uses. The problem is that if they
are free and public for you, they are free and public for the thousands of other
people who use Scrapebox. So proxies that are active and fast at a given moment
may have been banned everywhere five minutes later, or within a few seconds the
server behind a proxy may collapse because thousands of spammers have used that
particular proxy for the worst things. So there can be problems.
To solve this, most people who use Scrapebox intensively and professionally buy
private proxies.
Private proxies are simply IP addresses, proxies, that you purchase for your
exclusive use. So if you buy 10 private proxies, for example, you have the guarantee
that those 10 proxies are used only by you, so you get maximum performance and,
hopefully, if the service is good, they are active 24 hours a day.
But obviously it is something you pay for. The standard worldwide price for
private proxies is $1 per proxy per month. This means that 50 private proxies
generally cost $50 a month. I have tried about 3 or 4 private proxy providers,
so I know the subject well, and if you want a starting point I can recommend
this site, which is instanproxies.com. If we go to the pricing page, I can show
you that the price really is fixed. You can see that 25 private proxies, features
aside, cost a flat $25; 100 private proxies, $100; and so on.
As I said, in this video I will not talk about comment spam in any way, that is,
about all the Scrapebox functionality for spamming sites.
So let’s look at the first major feature, the one through which I got to know this
software: the harvester. The harvester lets you scrape Google SERPs. That means
I can enter all the keywords I want, along with Google operators such as site,
inurl, intitle and so on, mixed however I like; Scrapebox then runs these searches
on Google, through the proxies if I have entered any, takes all the results and
places them inside the software.
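To make the idea of mixing keywords and operators concrete, here is a minimal Python sketch of how such query strings could be composed. It only builds text; it does not contact any search engine. The function names `build_query` and `build_queries` are hypothetical and are not part of Scrapebox.

```python
def build_query(keyword, exact=True, intitle=False, site=None):
    """Compose one search query string from a keyword and optional operators."""
    # Quoting the keyword asks for an exact match.
    term = f'"{keyword}"' if exact else keyword
    if intitle:
        term = f"intitle:{term}"
    if site:
        term = f"{term} site:{site}"
    return term


def build_queries(keywords, **operators):
    """Apply the same operators to every keyword in the list."""
    return [build_query(kw, **operators) for kw in keywords]


queries = build_queries(
    ["first keyword", "second keyword"], intitle=True, site="example.com"
)
for query in queries:
    print(query)
```

Each resulting string is what one would type into the search box by hand; a harvester simply runs many of them in sequence.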
Obviously the uses of the harvester are endless. You may need to find all the
URLs for a given keyword for various reasons: because you are interested in a
specific sector, because you have discovered a URL pattern for doing a certain
thing and need to find all the URLs in Google’s index that match it, or because
you want to find all the YouTube videos that have a particular title. The uses
are many, so today we will try some fairly generic examples, but of course much
more advanced ones are possible. For example, let’s simply search Google for the
keyword “spaghetti cheese and pepper”. I put it in quotes so it is an exact
match, that is, the exact keyword is searched, and I start harvesting.
OK, finished. As you can see, this window, which I close with OK, is the scan report.
In this box Scrapebox has listed all the URLs it found in the SERP that contain
the keyword spaghetti cheese and pepper. I can see Barilla, I caught a glimpse of
Giallo Zafferano, Groupon, TripAdvisor and so on.
Now let’s try something a little more advanced; that search was just a random
example. Let’s clear the list with this button. Let’s say I want to find all the
URLs in Google’s index that have this keyword in the title and not elsewhere, so
I type intitle, colon, spaghetti cheese and pepper. But I am also interested in
all pages that have the phrase “lasagna with tomato” in the title, so let’s put
this one in double quotes too, and start.
Obviously, when Scrapebox does this, it uses the proxies I entered in rotation:
it uses the first for the first keyword, the second for the second keyword, then
if I run another scan it starts from the third proxy, and so on until it reaches
the end and starts again from the first, so as to get the best performance and
minimize the chance of being banned by Google during the operation. OK, as you
can see Scrapebox has finished and found all the URLs for these two keywords.
Now suppose, for example, that we want all pages with “spaghetti cheese and
pepper” in the title, but only Blogspot blogs, so I add site, colon,
blogspot.com. For lasagna with tomato, I still want all URLs that have this
keyword in the title, but I only want YouTube videos, so I add site, colon,
youtube.com; we clear the list and start again. Here is the harvesting result,
and I’ll pick a few results at random so we can look at them.
As you can see, these are all URLs of sites that presumably have the keyword
spaghetti cheese and pepper in the title and are Blogspot blogs. I have shown you
a couple of searches on the fly, and Scrapebox ran them with the default
settings. The harvester can obviously be customized a lot.
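The proxy rotation described a moment ago (first proxy for the first keyword, second for the second, the next scan continuing from the third, then wrapping around) can be sketched in Python as a simple round-robin. This is only an illustration of the idea, not Scrapebox’s actual code; the class name `ProxyRotator` and the addresses are made up.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxies in round-robin order, remembering the position across scans."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = cycle(proxies)

    def assign(self, keywords):
        """Pair each keyword with the next proxy in the rotation."""
        return [(keyword, next(self._cycle)) for keyword in keywords]


# Placeholder addresses from a documentation range, not real proxies.
rotator = ProxyRotator(["192.0.2.1:8080", "192.0.2.2:8080", "192.0.2.3:8080"])
first_scan = rotator.assign(["keyword one", "keyword two"])
second_scan = rotator.assign(["keyword three", "keyword four"])
print(first_scan)
print(second_scan)
```

Because the rotator keeps its position between calls, the second scan starts from the third proxy and then returns to the first, which spreads requests evenly over the list.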
If you look at the box here, I have selected only Google for harvesting because
it is the search engine I care about, but if you want, Scrapebox can search for
the keywords entered in the top-left box on Yahoo, Bing or AOL as well. For
Google I can also tell it which version of the search engine to start from. I
have left Google.com for speed, but it could be Google.it, or, if I were
interested in Vietnam, Google Vietnam, so that I get only that search engine’s results.
I can limit the URL search to the various areas of universal search: video,
news, blogs. And I can specify, for example, when the URLs were indexed.
So if I want very recent results, about a news item or something similar, I can
go to this field and select past 24 hours, or the last week, month or year.
Once the first phase is finished, the harvesting of URLs, and we have a list of
URLs we need for whatever reason, Scrapebox gives us a large number of tools for
working on that list.
Obviously we can simply export it here: as text, HTML, an Excel file, or an XML
sitemap, which can be very useful; or we can just copy and paste the list as I
did before. We can obviously import too, in case we need to bring in a list of
URLs from outside. Now let’s see what we can do with this list. For example,
with the first button, Remove / Filter, we can eliminate all duplicate URLs from
the list, because if you search for lots of keywords the same URL may be picked
up two or more times by Scrapebox. We can remove duplicate domains: even if the
URLs are different, I might need to keep a single URL per domain.
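As a rough Python sketch of these two clean-up steps, and not Scrapebox’s own implementation, the functions below remove exact duplicate URLs and keep a single URL per domain. For the per-domain case the sketch keeps the alphabetically first URL, which is how the tool’s behavior is described in this video; the function names are hypothetical.

```python
from urllib.parse import urlsplit


def remove_duplicate_urls(urls):
    """Drop exact duplicates while keeping the original order."""
    return list(dict.fromkeys(urls))


def remove_duplicate_domains(urls):
    """Keep one URL per host: the first one in alphabetical order."""
    kept = {}
    for url in sorted(urls):
        host = urlsplit(url).netloc.lower()
        kept.setdefault(host, url)
    return list(kept.values())


harvested = [
    "http://b.example/recipe",
    "http://a.example/recipe-2",
    "http://a.example/recipe-1",
    "http://b.example/recipe",
]
print(remove_duplicate_urls(harvested))
print(remove_duplicate_domains(harvested))
```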
If I am looking for lots of recipes and want only one from Giallo Zafferano,
although in reality there are many, I can click this and it will leave me the
first in alphabetical order. I can choose to remove all URLs that contain a
specific keyword. Say I have noticed that there are many URLs that, as you can
see, have the year in the path. I might decide to delete all those that have
2010 or 2011 in the URL; let’s try it.
I click “remove urls containing”, type 2010, and Scrapebox has removed all the
URLs that contained the string 2010. Or I can tell it to remove all URLs that do
not contain a specific keyword, an inclusive filter so to speak, that is, keep
only the URLs that contain a specific keyword. I can do these same operations
using an external text list, with these other two options, and so on.
Then there is Trim to root, which is very useful because it lets you strip, as
they say, the entire path from a URL, keeping only the domain. If I had found
this list of URLs but wanted to keep only the domains, I could click
“Trim to root” to get a clean list of domains.
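The filters just described are simple list operations. The Python sketch below shows one possible way to express them; the function names are hypothetical, and the exact output format of Scrapebox’s Trim to root (for example, whether it keeps the trailing slash) is an assumption.

```python
from urllib.parse import urlsplit


def remove_containing(urls, text):
    """Drop every URL that contains the given text."""
    return [url for url in urls if text not in url]


def keep_containing(urls, text):
    """Keep only the URLs that contain the given text."""
    return [url for url in urls if text in url]


def trim_to_root(urls):
    """Strip path, query and fragment, leaving scheme and host only."""
    roots = []
    for url in urls:
        parts = urlsplit(url)
        roots.append(f"{parts.scheme}://{parts.netloc}/")
    return roots


urls = [
    "http://blog.example/2010/05/recipe.html",
    "http://blog.example/2013/01/recipe.html",
    "http://other.example/recipes/pasta?id=7",
]
print(remove_containing(urls, "2010"))
print(keep_containing(urls, "recipe.html"))
print(trim_to_root(urls))
```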
I can check PageRank: Scrapebox automatically looks up the PageRank, and I can
tell it to look it up for the URL only or for the domain the URL belongs to.
There is also an index check: by clicking Google Indexed I can check whether a
URL is in the index. Since our harvesting came from a Google search, we are 100%
sure that all these pages are indexed in Google, but if I had imported them from
a text file I might need this command. Grab / Check simply extracts some generic
information from the pages, such as email addresses or comments.
As for proxies, as you may have noticed, at the bottom left there is a list of
the proxies I am using. They are my own private proxies, which I use when
harvesting; in this box you would enter yours.
But the great thing is that Scrapebox has a built-in feature for scraping
public, freely usable proxies. So it is a quick way to get a sizable list, even
hundreds, of proxy IP addresses that can be used immediately. So let’s see how
to find them.
I go to Manage, at the bottom left, which shows me a summary of my proxies, and I
click Harvest Proxies, below. Now Scrapebox shows me a set of anonymous lists,
whose names it does not reveal, from which it retrieves proxies. After several
attempts I have found that some lists on average provide more good, fast proxies
than others. In this case it is the last one, source 26, but there are others
too, I think 15 and some more. If you want you can select them all, and
Scrapebox will go to all the sites, collect the addresses and test them in real
time, but that would take a lot of time, so it is much more convenient to choose
very few, 3 or 4 sources, because a single source can yield as many as 10,000
proxies on average, so even one is plenty. So I leave 26 selected and we start.
At this stage Scrapebox is only grabbing, that is, just fetching the list of
proxies from this source.
In the second stage I will show you that Scrapebox can test the list it has just
downloaded, to verify that each proxy is working and anonymous.
As you can see, source 26 gave Scrapebox about 11,000 proxies. To move on to the
next step, checking their performance and anonymity, I click Apply.
So Scrapebox took this list of about 11,000 proxies, put it in this secondary
table, and lets me test them with the Test Proxies button, Test all proxies.
What is it doing? Scrapebox takes one proxy at a time and tries a trivial
operation, opening a page, and tells us what happened. If the proxy does not
respond at all it gives me an error 0, if it responds with too much latency it
gives me a timeout error, and if it returns a 404 page it of course says error 404.
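The outcomes listed above (error 0, timeout, error 404, or a working proxy) amount to a small classification rule. Here is a hedged Python sketch of that rule operating on already-measured values, so it makes no network requests. The function name, the 10-second threshold and the exact label strings are assumptions, not Scrapebox’s actual behavior.

```python
def classify_proxy_result(status_code, elapsed_seconds, timeout_seconds=10.0):
    """Turn one test outcome into a label.

    status_code is None when the proxy never answered;
    elapsed_seconds is None when no response time could be measured.
    """
    if elapsed_seconds is not None and elapsed_seconds > timeout_seconds:
        return "timeout"
    if status_code is None:
        return "error 0"
    if status_code == 200:
        return "ok"
    return f"error {status_code}"


# Made-up measurements for four proxies.
print(classify_proxy_result(None, None))   # no answer at all
print(classify_proxy_result(200, 25.0))    # answered, but too slowly
print(classify_proxy_result(404, 0.8))     # answered with a 404 page
print(classify_proxy_result(200, 0.4))     # working proxy
```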
But if I scroll down further, as you can see, there are proxies that actually
work. For example, the first one tells us it is Venezuelan, another Venezuelan,
this one should be “CN”, Chinese, and so on. It also tells me they are anonymous.
A Y in this column means they are anonymous, that is, they do not reveal the IP
the connection originates from, which is ours.
It also tells me the speed at which the proxy responds, which is a decisive
factor especially when doing mass harvesting. So it tells me whether the speed
is fast, medium or slow.
In this case it has already found, see down here, 180 proxies, so I would say I
can stop it early because that is plenty. Then, to extract only the verified
ones from this list, I go to Filter, Keep anonymous proxies, and so I keep a
list of 186 proxies that it managed to find and that are working, anonymous and
fast enough at this very moment. So we are not talking about websites that
publish proxy lists daily, perhaps at midnight, which are obviously already
unusable an hour later. Here you can harvest and test proxies in real time,
whenever you need them, and that is really a big deal.
Obviously this tool, the ability to scrape proxies and test them, can also be
useful for reasons that go beyond Scrapebox: I could find proxies through
Scrapebox and use them for other reasons, with other software, for other
purposes in computing; nobody forbids it. The tool is so powerful that it is
useful in many different situations, and it has helped me many times.
OK, once we have the list of proxies we want to use, we simply click Save. We
could save them to a text file, but I tell it to save the proxies within
Scrapebox, so it replaces the ones in the box at the bottom left. These are now
the proxies we analyzed a moment ago.
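The “keep anonymous proxies” filter and the save-to-text step used above can be pictured as follows. This is a minimal Python sketch under stated assumptions: the `ProxyCheck` record, its fields and the one-address-per-line output are hypothetical and do not describe Scrapebox’s internal data format.

```python
from dataclasses import dataclass


@dataclass
class ProxyCheck:
    address: str     # "host:port"
    working: bool    # the test request succeeded
    anonymous: bool  # the proxy did not reveal the originating IP
    speed: str       # "fast", "medium" or "slow"


def keep_anonymous(checks):
    """Keep only the proxies that work and are anonymous."""
    return [check for check in checks if check.working and check.anonymous]


def to_text(checks):
    """Render the kept proxies one address per line, ready for a text file."""
    return "\n".join(check.address for check in checks)


# Placeholder results using documentation-range addresses.
results = [
    ProxyCheck("192.0.2.10:3128", working=True, anonymous=True, speed="fast"),
    ProxyCheck("192.0.2.11:8080", working=False, anonymous=False, speed="slow"),
    ProxyCheck("192.0.2.12:80", working=True, anonymous=False, speed="medium"),
]
usable = keep_anonymous(results)
print(to_text(usable))
```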
So far we have talked about Scrapebox’s built-in features, the ones developed by
the software’s creator, but the beauty of this program is that there are lots of
addons that can be downloaded directly from within the software, so let’s see
how that works. I go to Addons and click Show available addons; it downloads the
latest list of addons I can install and shows me all the additional features
that can be installed in my Scrapebox. You can see that I have some already
installed; others I could install but have not, because I do not need them. So
now let’s look, on the fly, at the addons I have found most useful with Scrapebox.
OK, let’s start with the first, simple addon. As you can see I ran a very simple
search, for “personal loan”, and it extracted 428 URLs; it is a fairly popular
keyword. The addon is called Alive Check. From this screen I can load all the
URLs I found with the harvester, which, as you can see, is very practical: I do
not have to export to a txt file, reload it, and so on. I have loaded all the
URLs. I start it.
What does this tool do? Simple: it opens every URL and checks that the site is
alive, that it does not respond with a 404 or a 500 and does not time out. So it
lets me extract, from a list of URLs, only those that are actually active and
reachable. In this case, as you can see, it went pretty well because 420 of the
428 sites are reachable, and of course I could export only the live ones or only
the dead ones.
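The split into live and dead URLs can be sketched in Python like this. It works on status codes that have already been collected, so nothing is fetched here. Which codes count as “alive” is an assumption of this sketch (2xx and 3xx), and `split_alive_dead` is a hypothetical name, not the addon’s API.

```python
def split_alive_dead(status_by_url):
    """Partition URLs by the result of a previous request.

    status_by_url maps each URL to its HTTP status code,
    or to None when the request timed out.
    """
    alive, dead = [], []
    for url, status in status_by_url.items():
        if status is not None and 200 <= status < 400:
            alive.append(url)
        else:
            dead.append(url)
    return alive, dead


# Made-up results for four URLs.
checked = {
    "http://a.example/": 200,
    "http://b.example/": 404,
    "http://c.example/": 500,
    "http://d.example/": None,
}
alive, dead = split_alive_dead(checked)
print(alive)
print(dead)
```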
OK, moving on with the addons, which I’ll explain quickly. For example, Alexa
Rank lets me extract the Alexa rank for a list of URLs fairly quickly, like
PageRank and the other values Scrapebox can retrieve. Then there is the backlink
checker. I will not show this one simply because right now it cannot work: you
must enter the Moz API credentials, free ones are fine, and I do not have them
at the moment. Using Moz, it extracts the backlinks that point to a list of URLs
we give as input and lets you export them as txt, csv and so on. However, the
free credentials have limits on the number of links that can be scanned in a
given period, so it is an addon I rarely use, because obviously there are other
tools like Majestic and the like that are much more efficient. A very nice thing
I will show you, however, is the dofollow test. In this case I use the same URLs
as before as input, the ones found with the keyword personal loan, and through
this screen I can see whether all these pages contain a link to a site that
interests me and whether it is follow or nofollow.
So, I don’t know, suppose for example that these sites had guaranteed me, for
whatever reason, that these pages contain a link to a particular site, and I
want to check whether that is true; but this tool can be used for many different
purposes, if you think about it. In this case, I have seen a link on one
specific site, so let’s see whether the tool finds it.
Let’s say I want to find out whether there are follow or nofollow links to
fondidigaranzia.it. I start. And we watch. As you can see, after about 1 minute
Scrapebox tells me that 13 URLs contain a backlink, an external link, to
fondidigaranzia.it and that it is follow, because otherwise it would have put it
in the nofollow category.
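What a follow/nofollow check boils down to is looking at each link to the target domain and reading its `rel` attribute. Below is a small Python sketch using only the standard library’s `html.parser`; it analyzes an HTML string that is assumed to be already downloaded. The class and function names are hypothetical, and this is not how the addon itself is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkRelParser(HTMLParser):
    """Collect links to one target domain, split into follow and nofollow."""

    def __init__(self, target_domain):
        super().__init__()
        self.target = target_domain.lower()
        self.follow = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href") or ""
        host = urlsplit(href).netloc.lower()
        # Accept the bare domain and any subdomain such as www.
        if host != self.target and not host.endswith("." + self.target):
            return
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.follow.append(href)


def check_links(html, target_domain):
    parser = LinkRelParser(target_domain)
    parser.feed(html)
    return parser.follow, parser.nofollow


sample_html = (
    '<a href="http://www.fondidigaranzia.it/">fund</a>'
    '<a rel="nofollow" href="http://fondidigaranzia.it/info">info</a>'
    '<a href="http://other.example/">other</a>'
)
follow, nofollow = check_links(sample_html, "fondidigaranzia.it")
print(follow)
print(nofollow)
```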
So we can see that it is an institutional site, of the Ministry of Economy, and
that it is linked from this Intesa San Paolo page, from BancaMarche.it,
Cariparma, BancoDesio, Unipol Banca and so on. Let’s continue with the addons.
Another addon that has helped me many times and is really very useful is the
link extractor, which does not do what its name says: it does not extract links
and put them in a csv. No, it does something much simpler, much dumber, but in
some cases truly vital, because doing the same thing by hand or with other kinds
of software would be monstrously impractical; I cannot even find the adjective.
So I’ll show you. I load the list from the harvester, with Load URL list from
Scrapebox harvester, which brings back the same list as before. And I can ask it
to give me the total count of all the internal or external links on each page.
So I want to know, quickly, how many links on a given page point outside, or how
many point inside, for whatever reason I may need it.
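Counting internal versus external links can be illustrated with a short Python sketch based on the standard library. It takes the page URL and its HTML (assumed to be already downloaded) and compares each link’s host with the page’s host. Treating only an exact host match as internal is an assumption of this sketch; the names `LinkCounter` and `count_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCounter(HTMLParser):
    """Count links that stay on the page's host versus links that leave it."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlsplit(page_url).netloc.lower()
        self.internal = 0
        self.external = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        parts = urlsplit(urljoin(self.page_url, href))
        if parts.scheme not in ("http", "https"):
            return  # skip mailto:, javascript: and similar
        if parts.netloc.lower() == self.page_host:
            self.internal += 1
        else:
            self.external += 1


def count_links(page_url, html):
    counter = LinkCounter(page_url)
    counter.feed(html)
    return counter.internal, counter.external


sample_html = (
    '<a href="/about">about</a>'
    '<a href="http://site.example/contact">contact</a>'
    '<a href="http://partner.example/">partner</a>'
    '<a href="mailto:info@site.example">mail</a>'
)
internal, external = count_links("http://site.example/loans", sample_html)
print(internal, external)
```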
So I select internal and external, start it, and we see how it behaves.
After a few moments we can see the final result: the Findomestic personal loans
page has 42 internal links and 3 external, Segugio’s has 38, both internal and
external, Facile’s has 75 internal and 4 external, and so on. We can export all
this information, or sort one of these two columns in ascending or descending
order to do whatever analysis we need. Continuing through my list of installed
addons, the malware and phishing filter, for example, is very useful in certain
specific cases: it takes a list as input, from a txt file or from the harvester,
and checks whether the sites have been reported by Google for malware or
phishing attempts. So in the case of hacked sites, or when you need to run a
mass check on various sites potentially at risk, it immediately gives us the
information we need. Then there is the page authority addon, which again I’ll
only show you in passing: it requires the Moz API, free or paid, entered below
with the various credentials, the secret and so on. Then I would load the URLs
from the harvester, start it, and for each URL it would give me the Page
Authority, the famous PA, the DA, Domain Authority, MozRank and external links.
Next, what I wanted to show you is the page scanner, because this is another
tool that, like the proxy checker, the tool for finding proxies, is in my
opinion alone worth the 50 euros you pay for Scrapebox. The proxy part is in my
opinion worth even more than 50 euros on its own, because it is really something
unique and incredibly powerful, but the page scanner is useful too. So, what
does the page scanner do? As always, we give it the famous list of URLs as input.
I can ask the software to search all the pages for a piece of text or a regex.
Say I have in mind a particular piece of information that may be present in
these pages and I want to check whether it is there, what its content is, and so
on. I might need to search for a specific link, a given H1, a given structure
used for the meta description or the footer, and so on. I can teach the software
to look for specific patterns, regexes, or combinations of these, and it returns
all the results where that particular search matches. Now I’ll start it and
show you.
In this case we see that the software has completed all the URLs, so they have
all been scanned, but for only one, Segugio.it, it writes “asd”. That is simply
the name I gave this specific search: I needed a name, so I called it asd, and
asd here means it found what I was looking for. If we look at the searches I
configured, I set up just one: I told the software to open all the pages and
look for one thing, this keyword: “Multi-brand comparison”, which I knew was on
this Segugio page.
And of course in this list, the footprint list, I could add an endless number of
searches for the software to run, and the running time would not increase
exponentially. Why? The software takes a URL, downloads the page only once,
caches it, and then, having the page source, runs all the different scans we
asked for, which are extremely quick. So going from 1, 2, 3 to even 100
different scans is not a problem, because the main thing that slows everything
down is downloading the page.
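The “download once, scan many times” idea can be sketched in Python as follows. The download step is passed in as a function so the example runs without any network access; here it is replaced by a fake that reads from a dictionary. The names `scan_pages`, `fake_fetch` and the sample pages are hypothetical, and this is an illustration of the principle rather than the addon’s real implementation.

```python
import re


def scan_pages(urls, footprints, fetch):
    """Run every named footprint against each page, fetching each page once.

    footprints maps a search name to a regex pattern;
    fetch is any function that takes a URL and returns the page source.
    """
    compiled = {
        name: re.compile(pattern, re.IGNORECASE)
        for name, pattern in footprints.items()
    }
    hits = {}
    for url in urls:
        source = fetch(url)  # the slow part, done once per URL
        matched = [name for name, regex in compiled.items() if regex.search(source)]
        if matched:
            hits[url] = matched
    return hits


# Stand-in for real downloads, so the sketch is self-contained.
fake_pages = {
    "http://a.example/": "<h1>Multi-brand comparison</h1>",
    "http://b.example/": "<h1>Personal loans</h1>",
}
fetch_calls = []


def fake_fetch(url):
    fetch_calls.append(url)
    return fake_pages[url]


footprints = {"asd": r"multi-brand\s+comparison", "has_h1": r"<h1>"}
hits = scan_pages(list(fake_pages), footprints, fake_fetch)
print(hits)
print(len(fetch_calls))
```

Adding more footprints only adds more in-memory regex searches per page; the number of downloads stays equal to the number of URLs.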
Besides the page scanner there would be two more tools, but not very useful
ones. For example, the rapid indexer simply mass-pings a URL so that it gets
indexed faster than usual. It takes a list of URLs as input, you also give it
the ping services, and the software simply spams those pings so that Google and
the other bots detect the new page much sooner and index it more quickly.
Nowadays it is no longer useful, because submitting a sitemap, pinging the
sitemap and doing fairly generic, easy SEO operations gets the same result.
As you can see, apart from the addons I have shown you, which I already have
installed, there are lots more that I could install and that might be useful to you.
The port scanner; the duplicate remover, which accepts up to 180 million lines;
the broken link checker. Although there are WordPress plugins, for example, that
do this, the broken link checker can be very useful. The Google image grabber.
The social checker, which I imagine checks Facebook, Google+, Twitter and all
the social signals. The Unicode converter. The Google cache extractor. Then
there is the article scraper. There is a sitemap scraper. A blog analyzer, which
probably analyzes the CMS and whether comments are open, but I already said that
does not interest me.
The domain resolver. This one here, very useful, takes a large number of URLs as
input and turns them into their shortened version using a shortening service, I
guess it uses TinyURL, converting URLs to their tiny version in seconds. The
whois scraper, the fake PageRank checker and so on. As you have seen, I have
tried to give you a general overview of Scrapebox’s features, but of course
there are many more than the ones I showed. There are premium plugins, that is,
some addons you pay for. For example there is the rank tracker, I think, which
automatically tracks the ranking of some keywords; I guess you give it the
keywords, the URL and the domain you care about as input and it scans these
automatically, but of course there are other tools that do the same thing.
Scrapebox also has captcha solving, so if you pay for a service that solves
captchas automatically you can plug it into Scrapebox, and it will use it when
it goes spamming or posting comments, and things of that kind.
Scrapebox can be given blacklists and whitelists of what to touch and what not
to touch, what to scrape and what not to scrape.
It has lots of options for optimizing the speed and the way it scrapes the SERPs.
As I said at the beginning of the video, anyone who really understands the
potential of this software knows that Scrapebox should be used for anything but
link building. That is, the part at the bottom right, which does comment spam on
blogs or on other systems, RSS and lots of others, must never be used.
And I am not saying this because I know it actually works and would like to keep
this Scrapebox feature hidden, given that the software is by now famous for this
practice; on the contrary, I invite you to try it on your own sites and tell me
how quickly they disappear from Google, because that is how things are now.
It is impossible to believe that by pressing a button and sending a million, two
million, ten billion backlinks from sites already completely wrecked by
thousands of comments you can rank on Google for keywords where other people are
instead building good sites, with quality content, editorial backlinks and
things of that nature. It is unthinkable, it is just stupid. My view is that
anyone who dismisses this software as spam or black hat junk is evidently
ignorant, in the clearest and most literal sense of the term: they ignore the
functionality of the platform.
Otherwise I do not see how you can say that this software is completely useless
or only serves people who spam or do things like that. As I said earlier in the
video, the proxy part alone, the real-time proxy checker and proxy harvester,
could very well be separate standalone software, and a person could easily pay
50, 100 euros or even more for a license, because it is really phenomenal. Add
to this the whole harvester for Google SERPs with keywords, the proxy
management, and all the addons we have seen: for finding sites that return 404
or 500, for finding internal and external links, PageRank, Page Authority, Moz
data, and the page scanner, which accepts not only keywords to look for within
pages but also regexes.
For those who know the potential of regex, lucky them; I can get by with it a
bit, but obviously only a bit. Anyone who knows regex really well has found
paradise, because you can take a list of even millions of pages, enter lots of
footprints, that is, lots of regexes you are interested in, even hundreds, leave
Scrapebox running in the background while you do other things, and after a while
you get all the results you needed. I believe no other software on planet Earth
can do this specific thing, because it is absolutely unique. All the features I
have shown you so far, for a one-off 50 euros that also guarantees all future
updates, are in my opinion excellent value for money. In the description I leave
the link to buy Scrapebox at $57, so about 50 euros, because otherwise the
normal full version would cost you around $100, but I managed to find a hidden link.
What can I say, I hope you enjoyed the video. Until next time.