PS3Hax - Sony Fears Us  

Go Back   PS3Hax - Sony Fears Us > PS3Hax Corner > General Messages
Register FAQ Members List Calendar Mark Forums Read

General Messages Messages and what not

Reply
 
LinkBack Thread Tools
  #1 (permalink)  
Old 01-18-2007, 09:20 AM
haxen's Avatar
haxen   is offline
I shredded my PS3
PS3Hax Leader
 
Join Date: Dec 2006
Posts: 239
Default Spider IP Addresses

These Bots are allowed in, nothing else


Google (Googlebot)

66.249.65.103
66.249.65.197
66.249.65.236
66.249.66.18
66.249.72.74

MSN (MSN Bot)
65.55.208.107
65.55.208.108
65.55.208.109
65.55.208.110
65.55.208.111
65.55.208.112
65.55.212.75
65.55.213.86

Yahoo (Inktomi Slurp)
74.6.71.83
74.6.86.26
__________________
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 02-19-2007, 03:08 PM
haxen's Avatar
haxen   is offline
I shredded my PS3
PS3Hax Leader
 
Join Date: Dec 2006
Posts: 239
Default Re: Spider IP Addresses

Pattern matching for robots.txt:

Googlebot interprets some pattern matching.

Matching a sequence of characters using *
You can use an asterisk (*) to match a sequence of characters. For instance, to block access to all subdirectories that begin with private, you could use the following entry:

User-Agent: Googlebot
Disallow: /private*/
To block access to all URLs that include a question mark (?), you could use the following entry:

User-agent: *
Disallow: /*?*
Matching the end characters of the URL using $
You can use the $ character to specify matching the end of the URL. For instance, to block an URLs that end with .asp, you could use the following entry:

User-Agent: Googlebot
Disallow: /*.asp$
You can use this pattern matching in combination with the Allow directive. For instance, if a ? indicates a session ID, you may want to exclude all URLs that contain them to ensure Googlebot doesn't crawl duplicate pages. But URLs that end with a ? may be the version of the page that you do want included. For this situation, you can set your robots.txt file as follows:

User-agent: *
Allow: /*?$
Disallow: /*?
The Disallow:/ *? line will block any URL that includes a ? (more specifically, it will block any URL that begins with your domain name, followed by any string, followed by a question mark, followed by any string).

The Allow: /*?$ line will allow any URL that ends in a ? (more specifically, it will allow any URL that begins with your domain name, followed by a string, followed by a ?, with no characters after the ?).
__________________
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 02-19-2007, 08:01 PM
haxen's Avatar
haxen   is offline
I shredded my PS3
PS3Hax Leader
 
Join Date: Dec 2006
Posts: 239
Default Re: Spider IP Addresses

my robots.txt
# for [Please register to see this link. ] created by Haxen

User-agent: Googlebot
Allow: /*action=forum$
Disallow: /*=
Disallow: /*smf_images_url$
Disallow: /*smf_scripturl$
Disallow: /attachments/
Disallow: /avatars/
Disallow: /cgi-bin/
Disallow: /chat7/
Disallow: /FCKeditor/
Disallow: /gallery/
Disallow: /images/
Disallow: /Packages/
Disallow: /rssbox/
Disallow: /seo4smf_icons/
Disallow: /simplepie/
Disallow: /Smileys/
Disallow: /Sources/
Disallow: /Themes/
Disallow: /tmp/
Disallow: /tp-downloads/
Disallow: /tp-images/
Disallow: /wysiwyg/
Disallow: /phpinfo.php
Disallow: /seo4smf-redirect.php
Disallow: /Settings.php
Disallow: /SSI.php


User-agent: *
Disallow: /attachments/
Disallow: /avatars/
Disallow: /cgi-bin/
Disallow: /chat7/
Disallow: /FCKeditor/
Disallow: /gallery/
Disallow: /images/
Disallow: /Packages/
Disallow: /rssbox/
Disallow: /seo4smf_icons/
Disallow: /simplepie/
Disallow: /Smileys/
Disallow: /Sources/
Disallow: /Themes/
Disallow: /tmp/
Disallow: /tp-downloads/
Disallow: /tp-images/
Disallow: /wysiwyg/
Disallow: /phpinfo.php
Disallow: /seo4smf-redirect.php
Disallow: /Settings.php
Disallow: /SSI.php

User-Agent: msnbot
Crawl-Delay: 10

User-Agent: Slurp
Crawl-Delay: 20

#Unsafe robots to keep away

User-agent: Aqua_Products
Disallow: /

User-agent: asterias
Disallow: /

User-agent: b2w/0.1
Disallow: /

User-agent: BackDoorBot/1.0
Disallow: /

User-agent: Black Hole
Disallow: /

User-agent: BlowFish/1.0
Disallow: /

User-agent: Bookmark search tool
Disallow: /

User-agent: BotALot
Disallow: /

User-agent: BuiltBotTough
Disallow: /

User-agent: Bullseye/1.0
Disallow: /

User-agent: BunnySlippers
Disallow: /

User-agent: Cegbfeieh
Disallow: /

User-agent: CheeseBot
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: CherryPicker /1.0
Disallow: /

User-agent: CherryPickerElite/1.0
Disallow: /

User-agent: CherryPickerSE/1.0
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: cosmos
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /

User-agent: DittoSpyder
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: EroCrawler
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: FairAd Client
Disallow: /

User-agent: Flaming AttackBot
Disallow: /

User-agent: Foobot
Disallow: /

User-agent: Gaisbot
Disallow: /

User-agent: GetRight/4.2
Disallow: /

User-agent: grub
Disallow: /

User-agent: grub-client
Disallow: /

User-agent: Harvest/1.5
Disallow: /

User-agent: hloader
Disallow: /

User-agent: httplib
Disallow: /

User-agent: humanlinks
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: ia_archiver/1.6
Disallow: /

User-agent: InfoNaviRobot
Disallow: /

User-agent: Iron33/1.0.2
Disallow: /

User-agent: JennyBot
Disallow: /

User-agent: Kenjin Spider
Disallow: /

User-agent: Keyword Density/0.9
Disallow: /

User-agent: larbin
Disallow: /

User-agent: LexiBot
Disallow: /

User-agent: libWeb/clsHTTP
Disallow: /

User-agent: LinkextractorPro
Disallow: /

User-agent: LinkScan/8.1a Unix
Disallow: /

User-agent: LinkWalker
Disallow: /

User-agent: LNSpiderguy
Disallow: /

User-agent: lwp-trivial
Disallow: /

User-agent: lwp-trivial/1.34
Disallow: /

User-agent: Mata Hari
Disallow: /

User-agent: Microsoft URL Control
Disallow: /

User-agent: Microsoft URL Control - 5.01.4511
Disallow: /

User-agent: Microsoft URL Control - 6.00.8169
Disallow: /

User-agent: MIIxpc
Disallow: /

User-agent: MIIxpc/4.2
Disallow: /

User-agent: Mister PiX
Disallow: /

User-agent: moget
Disallow: /

User-agent: moget/2.1
Disallow: /

User-agent: mozilla/4
Disallow: /

User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
Disallow: /

User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 2000)
Disallow: /

User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
Disallow: /

User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 9
Disallow: /

User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows ME)
Disallow: /

User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)
Disallow: /

User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows XP)
Disallow: /

User-agent: mozilla/5
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: NetMechanic
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Openbot
Disallow: /

User-agent: Openfind
Disallow: /

User-agent: Openfind data gathere
Disallow: /

User-agent: Oracle Ultra Search
Disallow: /

User-agent: PerMan
Disallow: /

User-agent: ProPowerBot/2.14
Disallow: /

User-agent: ProWebWalker
Disallow: /

User-agent: psbot
Disallow: /

User-agent: Python-urllib
Disallow: /

User-agent: QueryN Metasearch
Disallow: /

User-agent: Radiation Retriever 1.1
Disallow: /

User-agent: RepoMonkey
Disallow: /

User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /

User-agent: RMA
Disallow: /

User-agent: searchpreview
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: SpankBot
Disallow: /

User-agent: spanner
Disallow: /

User-agent: suzuran
Disallow: /

User-agent: Szukacz/1.4
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: Telesoft
Disallow: /

User-agent: The Intraformant
Disallow: /

User-agent: TheNomad
Disallow: /

User-agent: TightTwatBot
Disallow: /

User-agent: Titan
Disallow: /

User-agent: toCrawl/UrlDispatcher
Disallow: /

User-agent: True_Robot
Disallow: /

User-agent: True_Robot/1.0
Disallow: /

User-agent: turingos
Disallow: /

User-agent: URL Control
Disallow: /

User-agent: URL_Spider_Pro
Disallow: /

User-agent: URLy Warning
Disallow: /

User-agent: VCI
Disallow: /

User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /

User-agent: Web Image Collector
Disallow: /

User-agent: WebAuto
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: WebBandit/3.50
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: WebEnhancer
Disallow: /

User-agent: WebmasterWorldForumBot
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: Website Quester
Disallow: /

User-agent: Webster Pro
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebZip
Disallow: /

User-agent: WebZip/4.0
Disallow: /

User-agent: Wget
Disallow: /

User-agent: Wget/1.5.3
Disallow: /

User-agent: Wget/1.6
Disallow: /

User-agent: WWW-Collector-E
Disallow: /

User-agent: Xenu's
Disallow: /

User-agent: Xenu's Link Sleuth 1.1c
Disallow: /

User-agent: Zeus
Disallow: /

User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /

User-agent: Zeus Link Scout
Disallow: /
__________________
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply



Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is On
Trackbacks are On
Pingbacks are On
Refbacks are On


All times are GMT -5. The time now is 07:39 AM.


Powered by vBulletin® Version 3.6.3
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
SEO by 3.0.0
© PS3Hax.com 2007 & 2008