<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title><![CDATA[Beijing SEO_Beijing SEO Training - 【元创SEO】]]></title> 
<link>http://www.yuan-chuang.cc/index.php</link> 
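<description><![CDATA[元创 has more than 10 years of hands-on and management experience in online marketing and SEO. Author of the book 《SEO实战 - 核心技术、优化策略、流量提升》 (SEO in Practice: Core Techniques, Optimization Strategies, Traffic Growth). SEO practitioner and co-founder of 推一把.]]></description> 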
<language>zh-cn</language> 
<copyright><![CDATA[Beijing SEO_Beijing SEO Training - 【元创SEO】]]></copyright>
<item>
<link>http://www.yuan-chuang.cc/read.php/.htm</link>
<title><![CDATA[Search Engines That Frequently Crawl Sites, and the Spider Names of the Major Search Engines]]></title> 
<author> &lt;&gt;</author>
<category><![CDATA[SEO Knowledge Base]]></category>
<pubDate>Mon, 09 Feb 2009 03:16:27 +0000</pubDate> 
<guid>http://www.yuan-chuang.cc/read.php/.htm</guid> 
<description>
<![CDATA[ 
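	<strong>Summary of Search Engines That Frequently Crawl Sites</strong><br/><br/>Googlebot-Image/1.0 <br/>Mediapartners-Google <br/>Mediapartners-Google/2.1 <br/>msnbot/1.1 (+http://search.msn.com/msnbot.htm) <br/>Sosospider+(+http://help.soso.com/webspider.htm) <br/>Baiduspider+(+http://www.baidu.com/search/spider.htm) <br/>Yanga WorldSearch Bot v1.1/beta (http://www.yanga.co.uk/) <br/>Gaisbot/3.0+(robot06@gais.cs.ccu.edu.tw;+http://gais.cs.ccu.edu.tw/robot.php) <br/>Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) <br/>Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) <br/>Mozilla/5.0 (compatible; Yahoo! Slurp/3.0; http://help.yahoo.com/help/us/ysearch/slurp) <br/>Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)<br/><br/>
What does this list mean in practice? Try a quick calculation:<br/><br/>
Suppose a site has 50,000 pages and 10 different search engines crawl it. If they all finish within a single day, that day brings 500,000 requests (50,000 pages x 10 crawlers) that generate no revenue and may even drag down the server. If the crawl is spread over a week, the server may cope, but a good deal of bandwidth is still wasted. Perhaps that volume still seems acceptable. But what if one server hosts ten similar sites? With so many resources spent on search engines, what does the site get in return?<br/><br/>
You might ask: if search engines are not allowed to crawl, how will the site ever be found? I agree. But there are so many search engines worldwide that letting every one of them crawl is clearly not the ideal situation for a site.<br/><br/>
I recommend the following measures:<br/><br/>
Weed out the weak, keep the strong<br/>A site needs channels for exposure, and search engines are one you cannot afford to miss. But it is enough to choose the best ones, such as Google and Yahoo. Smaller search engines can be admitted later, once they have built a reputation. <br/>
Go where your audience is<br/>If a search engine has a strong regional focus, such as the well-known search engines in mainland China, and its users overlap with the site's target audience, then it is worth admitting that search engine. The same rule still applies: weed out the weak and keep the strong. <br/>
Guard at every layer<br/>Immature crawlers do not follow the robots.txt protocol at all; once they pick a site, they fetch relentlessly. So review the site's traffic logs from time to time and use robots.txt as the first layer of control over the search engines that appear there. Then add a second layer in the web server or application configuration to shut out the search engines you do not want to deal with, saving resources to serve more customers. <br/>
Broaden your channels<br/>The more promotion channels a site has, the better. Search engines are indispensable, but they are not the only channel. A site should consider which channels suit its positioning and service supply chain, and promote itself through currently popular means, such as RSS feeds, bookmarking sites, MSN, email sharing, and so on.<br/><br/>
<strong>Spider Names of the Major Search Engines</strong><br/>This post lists the better-known crawlers worldwide that you may want to address in robots.txt. To keep a particular directory out of search engine indexes, follow the examples below.<br/>These settings, of course, go in robots.txt. If you know how to combine this list with 登陆奇兵 and your own scripts, it can also bring you a great deal of overseas IP traffic!<br/>
The better-known search engine spider names are:<br/>Google's spider: Googlebot<br/>Baidu's spider: baiduspider<br/>Yahoo's spider: Yahoo Slurp<br/>MSN's spider: Msnbot<br/>Altavista's spider: Scooter<br/>Lycos's spider: Lycos_Spider_(T-Rex) <br/>Alltheweb's spider: FAST-WebCrawler/ <br/>INKTOMI's spider: Slurp<br/>
For reference, each robots.txt entry takes this form:<br/>User-agent: (spider name)<br/>Disallow: (file or directory path) <br/>User-agent: Black Hole 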
<br/>Disallow: / <br/>User-agent: Titan <br/>Disallow: / <br/>User-agent: WebStripper <br/>Disallow: / <br/>User-agent: NetMechanic <br/>Disallow: / <br/>User-agent: CherryPicker <br/>Disallow: / <br/>User-agent: EmailCollector <br/>Disallow: / <br/>User-agent: EmailSiphon <br/>Disallow: / <br/>User-agent: WebBandit <br/>Disallow: / <br/>User-agent: EmailWolf <br/>Disallow: / <br/>User-agent: ExtractorPro <br/>Disallow: / <br/>User-agent: CopyRightCheck <br/>Disallow: / <br/>User-agent: Crescent <br/>Disallow: / <br/>User-agent: NICErsPRO <br/>Disallow: / <br/>User-agent: Wget <br/>Disallow: / <br/>User-agent: SiteSnagger <br/>Disallow: / <br/>User-agent: ProWebWalker <br/>Disallow: / <br/>User-agent: CheeseBot <br/>Disallow: / <br/>User-agent: mozilla/4 <br/>Disallow: / <br/>User-agent: mozilla/5 <br/>Disallow: / <br/>User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT) <br/>Disallow: / <br/>User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95) <br/>Disallow: / <br/>User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 9 <br/>Disallow: / <br/>User-agent: ia_archiver <br/>Disallow: / <br/>User-agent: ia_archiver/1.6 <br/>Disallow: / <br/>User-agent: Alexibot <br/>Disallow: / <br/>User-agent: Teleport <br/>Disallow: / <br/>User-agent: TeleportPro <br/>Disallow: / <br/>User-agent: Wget <br/>Disallow: / <br/>User-agent: MIIxpc <br/>Disallow: / <br/>User-agent: Telesoft <br/>Disallow: / <br/>User-agent: Website Quester <br/>Disallow: / <br/>User-agent: WebZip <br/>Disallow: / <br/>User-agent: moget/2.1 <br/>Disallow: / <br/>User-agent: WebZip/4.0 <br/>Disallow: / <br/>User-agent: WebStripper <br/>Disallow: / <br/><br/>----<br/>User-agent: WebSauger <br/>Disallow: / <br/>User-agent: WebCopier <br/>Disallow: / <br/>User-agent: NetAnts <br/>Disallow: / <br/>User-agent: Mister PiX <br/>Disallow: / <br/>User-agent: WebAuto <br/>Disallow: / <br/>User-agent: TheNomad <br/>Disallow: / <br/>User-agent: WWW-Collector-E <br/>Disallow: / <br/>User-agent: RMA 
<br/>Disallow: / <br/>User-agent: libWeb/clsHTTP <br/>Disallow: / <br/>User-agent: asterias <br/>Disallow: / <br/>User-agent: turingos <br/>Disallow: / <br/>User-agent: spanner <br/>Disallow: / <br/>User-agent: InfoNaviRobot <br/>Disallow: / <br/>User-agent: Harvest/1.5 <br/>Disallow: / <br/>User-agent: ExtractorPro <br/>Disallow: / <br/>User-agent: Bullseye/1.0 <br/>Disallow: / <br/>User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95) <br/>Disallow: / <br/>User-agent: Crescent Internet ToolPak HTTPOLE Control v.1.0 <br/>Disallow: / <br/>User-agent: CherryPickerSE/1.0 <br/>Disallow: / <br/>User-agent: CherryPickerElite/1.0 <br/>Disallow: / <br/>User-agent: WebBandit/3.50 <br/>Disallow: / <br/>User-agent: NICErsPRO <br/>Disallow: / <br/>User-agent: Microsoft URL Control - 5.01.4511 <br/>Disallow: / <br/>User-agent: DittoSpyder <br/>Disallow: / <br/>User-agent: Foobot <br/>Disallow: / <br/>User-agent: WebmasterWorldForumBot <br/>Disallow: / <br/>User-agent: SpankBot <br/>Disallow: / <br/>User-agent: BotALot <br/>Disallow: / <br/>User-agent: lwp-trivial/1.34 <br/>Disallow: / <br/>User-agent: lwp-trivial <br/>Disallow: / <br/>User-agent: BunnySlippers <br/>Disallow: / <br/>User-agent: Microsoft URL Control - 6.00.8169 <br/>Disallow: / <br/>User-agent: URLy Warning <br/>Disallow: / <br/>User-agent: Wget <br/>Disallow: / <br/>User-agent: Wget/1.5.3 <br/>Disallow: / <br/>User-agent: LinkWalker <br/>Disallow: / <br/>User-agent: cosmos <br/>Disallow: / <br/>User-agent: moget <br/>Disallow: / <br/>User-agent: hloader <br/>Disallow: / <br/>User-agent: humanlinks <br/>Disallow: / <br/>User-agent: LinkextractorPro <br/>Disallow: / <br/>User-agent: Offline Explorer <br/>Disallow: / <br/>User-agent: Mata Hari <br/>Disallow: / <br/>User-agent: LexiBot <br/>Disallow: / <br/>User-agent: Offline Explorer <br/>Disallow: / <br/>User-agent: Web Image Collector <br/>Disallow: / <br/>User-agent: The Intraformant <br/>Disallow: / <br/>User-agent: True_Robot/1.0 <br/>Disallow: / 
<br/>User-agent: True_Robot <br/>Disallow: / <br/>User-agent: BlowFish/1.0 <br/>Disallow: / <br/>User-agent: JennyBot <br/>Disallow: / <br/>User-agent: MIIxpc/4.2 <br/>Disallow: / <br/>User-agent: BuiltBotTough <br/>Disallow: / <br/>User-agent: ProPowerBot/2.14 <br/>Disallow: / <br/>User-agent: BackDoorBot/1.0 <br/>Disallow: / <br/>User-agent: toCrawl/UrlDispatcher <br/>Disallow: / <br/>User-agent: WebEnhancer <br/>Disallow: / <br/>User-agent: TightTwatBot <br/>Disallow: / <br/>User-agent: suzuran <br/>Disallow: / <br/>User-agent: VCI WebViewer VCI WebViewer Win32 <br/>Disallow: / <br/>User-agent: VCI <br/>Disallow: / <br/>User-agent: Szukacz/1.4 <br/>Disallow: / <br/>User-agent: QueryN Metasearch <br/>Disallow: / <br/>User-agent: Openfind data gathere <br/>Disallow: / <br/>User-agent: Openfind <br/>Disallow: / <br/>User-agent: Xenu's Link Sleuth 1.1c <br/>Disallow: / <br/>User-agent: Xenu's <br/>Disallow: / <br/>User-agent: Zeus <br/>Disallow: / <br/>User-agent: RepoMonkey Bait & Tackle/v1.01 <br/>Disallow: / <br/>User-agent: RepoMonkey <br/>Disallow: / <br/>User-agent: Zeus 32297 Webster Pro V2.9 Win32 <br/>Disallow: / <br/>User-agent: Webster Pro <br/>Disallow: / <br/>User-agent: EroCrawler <br/>Disallow: / <br/>User-agent: LinkScan/8.1a Unix <br/>Disallow: / <br/>User-agent: Kenjin Spider <br/>Disallow: / <br/>User-agent: Keyword Density/0.9 <br/>Disallow: / <br/>User-agent: Kenjin Spider <br/>Disallow: / <br/>User-agent: Cegbfeieh <br/>Disallow: / <br/>
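<br/>The two layers of control described in this post (robots.txt first, then a check on the server or in the application) can be sketched in Python. This is only a minimal illustration under stated assumptions, not the setup used on this site: the shortened robots.txt text, the BLOCKED_SUBSTRINGS tuple, the helper names robots_allows and server_should_block, and the example.com URL are all invented for the example. The first layer uses the standard library module urllib.robotparser to show what a well-behaved crawler would conclude from robots.txt; the second is the kind of User-Agent check an application could apply to crawlers that ignore robots.txt.<br/>
<pre>
```python
from urllib.robotparser import RobotFileParser

# Hypothetical, shortened robots.txt: shut out two of the agents listed
# in this post and leave the site open to every other crawler.
ROBOTS_TXT = """\
User-agent: WebZip
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow:
"""

# Hypothetical second-layer blocklist, matched case-insensitively against
# the User-Agent request header by the server or application.
BLOCKED_SUBSTRINGS = ("webzip", "emailsiphon")


def robots_allows(user_agent, url="http://example.com/"):
    """Layer 1: what a crawler that honours robots.txt would decide."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def server_should_block(user_agent_header):
    """Layer 2: refuse requests whose User-Agent matches the blocklist."""
    header = user_agent_header.lower()
    return any(name in header for name in BLOCKED_SUBSTRINGS)


if __name__ == "__main__":
    for agent in ("Googlebot", "WebZip/4.0", "EmailSiphon"):
        print(agent, robots_allows(agent), server_should_block(agent))
```
</pre>
With these assumed rules, robots_allows("WebZip/4.0") is false and robots_allows("Googlebot") is true, while server_should_block catches a WebZip request even if the crawler never reads robots.txt. In a real deployment the second check would normally live in the web server configuration or in request-handling code, and the blocklist would come from a review of the traffic logs.<br/>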
]]>
</description>
</item><item>
<link>http://www.yuan-chuang.cc/read.php/.htm#blogcomment</link>
<title><![CDATA[[Comments] Search Engines That Frequently Crawl Sites, and the Spider Names of the Major Search Engines]]></title> 
<author> &lt;user@domain.com&gt;</author>
<category><![CDATA[Comments]]></category>
<pubDate>Thu, 01 Jan 1970 00:00:00 +0000</pubDate> 
<guid>http://www.yuan-chuang.cc/read.php/.htm#blogcomment</guid> 
<description>
<![CDATA[ 
	
]]>
</description>
</item>
</channel>
</rss>