Crawler4j

  • For a Public crawler
  • Called Crawler4j (GitHub)
  • From The United States
  • By GitHub
  • Gets a score of 36%

What is Crawler4j?

Crawler4j is an open-source web crawler for Java, available on GitHub, so anyone can download and run it. GitHub describes the crawler as follows: "crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can setup a multi-threaded web crawler in few minutes." Because anyone can use it, it is impossible to know what is done with the data it collects.


BotRank for Crawler4j (36%):

The Internet of Bots has evaluated Crawler4j against 50 different checkpoints, of which 18 have been confirmed as positive. These checkpoints evaluate the transparency and occurrence of a bot and don't necessarily say anything about its quality. Each of the 50 checkpoints counts for 2%, so the BotRank is calculated as 18 × 2 = 36%. The details of the bot and its BotRank are shown below. For more information on how the BotRank is calculated, visit the Botrank page.
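The BotRank arithmetic can be sketched in a few lines of Python. This is a minimal sketch, assuming every checkpoint carries the same weight, as the 18 times 2 calculation suggests. The category scores are copied from the sections of this page; the names `CATEGORY_SCORES` and `bot_rank` are hypothetical and not part of any published BotRank tool.

```python
# Category scores as (confirmed, total), copied from this page.
CATEGORY_SCORES = {
    "User Agent(s)": (2, 5),
    "Whois": (5, 5),
    "Weblinks": (2, 5),
    "Usage": (3, 5),
    "Occurrence": (3, 15),
    "Webdepth": (3, 15),
}


def bot_rank(scores):
    """Return the share of confirmed checkpoints as a percentage.

    Assumption: all checkpoints are weighted equally.
    """
    confirmed = sum(c for c, _ in scores.values())
    total = sum(t for _, t in scores.values())
    return 100 * confirmed / total


print(f"{bot_rank(CATEGORY_SCORES):.0f}%")  # 18 of 50 checkpoints -> 36%
```

With 50 checkpoints in total, each confirmed checkpoint adds 2 percentage points, which is why 18 confirmed checkpoints give 36%.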

User Agent(s) (2/5):

  1. Distinguishable: Yes
  2. Botname: Crawler4j
  3. Email: Not mentioned
  4. Version: Not mentioned
  5. Mozilla: Not mentioned
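
The user agent checkpoints above can be approximated with simple string tests. The sketch below is an illustration only: the regular expressions are assumed heuristics, not the rules The Internet of Bots actually applies, and the sample user agent strings and the function name `user_agent_checkpoints` are hypothetical.

```python
import re


def user_agent_checkpoints(user_agent, bot_name="crawler4j"):
    """Apply rough, assumed heuristics to a User-Agent string."""
    ua = user_agent.strip()
    return {
        # Bot name appears anywhere in the string (case-insensitive).
        "botname": bot_name.lower() in ua.lower(),
        # Something shaped like an email address is present.
        "email": re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", ua) is not None,
        # A "/1.2"-style version token is present.
        "version": re.search(r"/\d+(\.\d+)*", ua) is not None,
        # The string starts with the conventional "Mozilla/" prefix.
        "mozilla": ua.lower().startswith("mozilla/"),
    }


# Hypothetical sample string, not taken from a real log.
sample = "crawler4j (https://example.org/bot-info)"
print(user_agent_checkpoints(sample))
```

For the hypothetical sample, only the bot name check succeeds, which is consistent with email, version, and Mozilla being listed as "Not mentioned".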

Whois (5/5):

  1. Public: Visit Whois
  2. Organization: GitHub
  3. Country: The United States
  4. City: San Francisco
  5. Street: Yes

Weblinks (2/5):

  1. User agent: Visit Crawler4j (GitHub)
  2. Crawler: Visit Crawler4j (GitHub)
  3. Homepage: Not available
  4. Query: Not available
  5. Adding: Not available

Usage (3/5):

  1. Recommended: Yes
  2. Category: Public crawler
  3. Free query: No
  4. Register: No
  5. Logo: Yes

Occurrence (3/15):

  1. Has visited during 3 of the 15 control months

Webdepth (3/15):

  1. Has visited 3 of the 15 control sites