Reply to post • Re: Scraping • The Register Forums

Tuesday 26th October 2021 07:48 GMT andy 103

Re: Scraping

It's quite worrying how poorly educated Reg readers are.

What exactly do you think is being indexed? robots.txt tells search engine crawlers which content they may crawl, and therefore index. That content is accessible to anybody, i.e. it is on public web pages, not behind a login or stored in a database that is inaccessible except to authorised/authenticated users. A Google bot cannot get around a login screen (hint: it doesn't have any credentials to log in with!).
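
For anyone who wants to see what a well-behaved crawler does with robots.txt, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules and the example.com URLs are made up for illustration; they are not Facebook's actual robots.txt, and the helper name allowed() is my own. Bear in mind robots.txt is only advisory: it tells a polite crawler what it may fetch, it does not enforce anything, and it has nothing to do with pages behind a login.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string rather than fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("Googlebot", "https://example.com/public/page.html"))
    print(allowed("Googlebot", "https://example.com/private/page.html"))
```

With these made-up rules the first URL is permitted and the second is not, which is all robots.txt ever decides for a crawler that chooses to honour it.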

The only way that Google could "scrape" phone numbers - with reference to this story - is if there was a publicly accessible web page (or pages) on Facebook which listed individuals' phone numbers. There isn't. To see somebody's phone number you have to be:

1. Logged in

2. Either a connection of the user, or looking at a number the user has set to "public", which isn't even the default setting.

In either case (1) still applies, so a Google bot cannot index phone numbers on people's Facebook accounts.

It really does concern me how Reg readers make posts as if they know what they're talking about. Go and actually try it if you think otherwise. Google your phone number and see if there's anything on the domain facebook.com for it. (There won't be.)

If you're going to be really pedantic about Google indexing the names on people's profiles, there's even a setting in Facebook where you can stop search engines indexing your page.

Facebook sues scraper who sold 178 million phone numbers and user IDs
