Procedural File: _stats.php
Source Location: /blogs/conf/_stats.php
This is b2evolution's stats config file.
| Deprecated: | TODO: This file now holds only things that should be moved around due to the hitlog refactoring. This file sets how b2evolution will log hits and stats. Last significant changes to this file: version 1.6 |
| Filesource: | Source Code for this file |
Search params needed to extract keywords from a search engine referer URL
For example, a referer of http://google.com?s=keyphraz yields the keyphrase keyphraz.
fp> TODO: merge with above table
dh> Piwik (open source web analytics) might have good data to use for this (referrer/search param pairs)
dh> Also useful/required: default encoding of search_params values (e.g. latin1 for suche.t-online.de)
fp> TODO: put into configurable database table
Default value: array(
'q',
'as_q',
'as_epq',
'query',
'search',
's',
'p',
'kw',
'qs',
'searchfor',
'r',
'rdata',
'string',
'su',
'Gw',
'text',
'search_query',
'wd',
'keywords',
)
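The lookup described above can be sketched roughly as follows. This is a minimal illustration, not b2evolution's actual implementation: the function name `extract_keyphrase()` and the variable `$params` are hypothetical, and the sketch assumes the first matching parameter wins and ignores the character-encoding issue raised in the developer notes.

```php
<?php
// Hypothetical helper (not part of b2evolution): returns the value of the
// first known search parameter found in the referer's query string, or null.
function extract_keyphrase(string $referer, array $search_params): ?string
{
    $query = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return null; // no query string, so nothing to extract
    }

    // parse_str() URL-decodes the values for us.
    parse_str($query, $vars);

    foreach ($search_params as $param) {
        if (isset($vars[$param]) && is_string($vars[$param]) && $vars[$param] !== '') {
            return $vars[$param];
        }
    }
    return null;
}

// Shortened copy of the default list, for illustration only.
$params = array('q', 'as_q', 'as_epq', 'query', 'search', 's', 'p');

$keyphrase = extract_keyphrase('http://www.google.com/search?hl=en&q=keyphraz', $params);
```

Because the list is scanned in order, putting the most common parameters (such as 'q') first keeps the typical lookup short.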
Search engines for statistics
The following substrings will be looked up in the Referer HTTP header in order to identify search engines.
Default value: array(
'//www.google.',
'//images.google.',
'//video.google.',
'.hotbot.',
'.altavista.',
'.excite.',
'.voila.fr/',
'http://search',
'://suche.',
'search.',
'search2.',
'http://recherche',
'recherche.',
'recherches.',
'vachercher.',
'feedster.com/',
'alltheweb.com/',
'daypop.com/',
'feedster.com/',
'technorati.com/',
'weblogs.com/',
'exalead.com/',
'killou.com/',
'buscador.terra.es',
'web.toile.com',
'metacrawler.com/',
'.mamma.com/',
'.dogpile.com/',
'search1-1.free.fr',
'search1-2.free.fr',
'overture.com',
'startium.com',
'2020search.com',
'bestsearchonearth.info',
'mysearch.com',
'popdex.com',
'64.233.167.104',
'seek.3721.com',
'http://netscape.',
'http://www.netscape.',
'/searchresults/',
'/websearch?',
'http://results.',
'baidu.com/',
'reacteur.com/',
'http://www.lmi.fr/',
'kartoo.com/',
'icq.com/search',
'alexa.com/search',
'att.net/s/',
'blingo.com/search',
'crawler.com/search/',
'inbox.com/search/',
'scroogle.org/',
'cuil.com/',
'yandex.ru/yandsearch',
'go.mail.ru/search',
'//www.bing.com',
'//cc.bingj.com',
'.qip.ru/',
)
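A substring lookup of this kind might look like the sketch below. The function name `is_search_engine_referer()` and the variable `$signatures` are hypothetical, and the sketch assumes a case-sensitive match with `strpos()`; the source does not say whether b2evolution compares case-insensitively.

```php
<?php
// Hypothetical helper (not part of b2evolution): true if any signature
// substring occurs anywhere in the referer.
function is_search_engine_referer(string $referer, array $signatures): bool
{
    foreach ($signatures as $signature) {
        // Compare with !== false: a match at offset 0 is still a match.
        if (strpos($referer, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Shortened copy of the default list, for illustration only.
$signatures = array('//www.google.', '.altavista.', 'http://search', '//www.bing.com');

$from_engine = is_search_engine_referer('http://www.google.fr/search?q=b2evolution', $signatures);
```

Plain substrings are cheap to test but coarse: a signature such as 'search.' matches any host containing that text, which is one motivation for the regexp Todo listed under Information Tags.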
Information Tags:
| Todo: | fp> merge definitions for search engine sig + keyword param + position param into a single array |
| Todo: | fp> move to admin interface (specific list editor), include query params fp> Actually, I'm not sure a DB table would be a good idea. It would slow down lookup and it's not really a table the average user is going to modify anyway... dh> It would be a cheap query, and the processing _might_ be even faster if we can ask MySQL to do the lookup instead of always iterating over the same list. Also, this is a good thing for autoupdates. You're right about average users changing it though. |
| Todo: | fp> have regexps, for example for //www\.google\.[^/]+/search |
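The regexp idea in the last Todo could be prototyped along these lines. This is only a sketch of a possible future design, not existing b2evolution behavior: `matches_engine_pattern()` and `$engine_patterns` are hypothetical names, and only the Google pattern is taken from the note above.

```php
<?php
// Hypothetical pattern list. '#' is used as the delimiter so that the
// slashes in the pattern need no escaping.
$engine_patterns = array(
    '#//www\.google\.[^/]+/search#',
);

// Hypothetical helper: true if the referer matches any pattern.
function matches_engine_pattern(string $referer, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $referer) === 1) {
            return true;
        }
    }
    return false;
}

$is_search = matches_engine_pattern('http://www.google.co.uk/search?q=b2evolution', $engine_patterns);
```

Unlike the plain '//www.google.' substring, this pattern would not count hits coming from non-search Google pages on the same host.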
UserAgent identifiers for logging/statistics
The following substrings will be looked up in the User-Agent HTTP header.
The 'aggregator' type is currently used only to "translate" user agent strings into readable names. An aggregator hit itself is detected by the fact that the feed is accessed.
Default value: array(
array('robot', 'Googlebot', 'Google (Googlebot)'),
array('robot', 'Slurp/', 'Inktomi (Slurp)'),
array('robot', 'Yahoo! Slurp;', 'Yahoo (Slurp)'),
array('robot', 'msnbot/', 'MSN Search (msnbot)'),
array('robot', 'Frontier/', 'Userland (Frontier)'),
array('robot', 'ping.blo.gs/', 'blo.gs'),
array('robot', 'organica/', 'Organica'),
array('robot', 'Blogosphere/', 'Blogosphere'),
array('robot', 'blogging ecosystem crawler', 'Blogging ecosystem'),
array('robot', 'FAST-WebCrawler/', 'Fast'),
array('robot', 'timboBot/', 'Breaking Blogs (timboBot)'),
array('robot', 'NITLE Blog Spider/', 'NITLE'),
array('robot', 'The World as a Blog ', 'The World as a Blog'),
array('robot', 'daypopbot/ ', 'DayPop'),
array('robot', 'Bitacle bot/', 'Bitacle'),
array('robot', 'Sphere Scout', 'Sphere Scout'),
array('robot', 'Gigabot/', 'Gigablast (Gigabot)'),
array('robot', 'Yandex/', 'Yandex'),
array('robot', 'Mail.Ru/', 'Mail.Ru'),
array('robot', 'Baiduspider', 'Baidu spider'),
array('robot', 'infometrics-bot', 'Infometrics Bot'),
array('robot', 'DotBot/', 'DotBot'),
array('robot', 'Twiceler-', 'Cuil (Twiceler)'),
array('robot', 'discobot/', 'Discovery Engine'),
array('robot', 'Speedy Spider', 'Entireweb (Speedy Spider)'),
array('robot', 'monit/', 'Monit'),
array('robot', 'Sogou web spider', 'Sogou'),
array('robot', 'Tagoobot/', 'Tagoobot'),
array('robot', 'MJ12bot/', 'Majestic-12'),
array('robot', 'ia_archiver', 'Alexa crawler'),
array('robot', 'KaloogaBot', 'Kalooga'),
array('robot', 'Flexum/', 'Flexum'),
array('robot', 'OOZBOT/', 'OOZBOT'),
array('robot', 'ApptusBot', 'Apptus'),
array('robot', 'psycheclone', 'Psycheclone'),
array('aggregator', 'AppleSyndication/', 'Safari RSS (AppleSyndication)'),
array('aggregator', 'Feedreader', 'Feedreader'),
array('aggregator', 'Syndirella/', 'Syndirella'),
array('aggregator', 'rssSearch Harvester/', 'rssSearch Harvester'),
array('aggregator', 'Newz Crawler', 'Newz Crawler'),
array('aggregator', 'MagpieRSS/', 'Magpie RSS'),
array('aggregator', 'CoologFeedSpider', 'CoologFeedSpider'),
array('aggregator', 'Pompos/', 'Pompos'),
array('aggregator', 'SharpReader/', 'SharpReader'),
array('aggregator', 'Straw ', 'Straw'),
array('aggregator', 'YandexBlog', 'YandexBlog'),
array('aggregator', ' Planet/', 'Planet Feed Reader'),
array('aggregator', 'UniversalFeedParser/', 'Universal Feed Parser'),
)
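Each entry reads as (type, signature substring, display name). A lookup over such a list could be sketched as below; `classify_user_agent()` and `$known_agents` are hypothetical names, and the sketch assumes a case-sensitive match where the first matching entry wins, which the source does not confirm.

```php
<?php
// Hypothetical helper (not part of b2evolution): returns the type and
// display name of the first entry whose signature occurs in the user
// agent string, or null if nothing matches.
function classify_user_agent(string $user_agent, array $known_agents): ?array
{
    foreach ($known_agents as $entry) {
        list($type, $signature, $name) = $entry;
        if (strpos($user_agent, $signature) !== false) {
            return array('type' => $type, 'name' => $name);
        }
    }
    return null;
}

// Shortened copy of the default list, for illustration only.
$known_agents = array(
    array('robot', 'Googlebot', 'Google (Googlebot)'),
    array('robot', 'msnbot/', 'MSN Search (msnbot)'),
    array('aggregator', 'MagpieRSS/', 'Magpie RSS'),
);

$hit = classify_user_agent(
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    $known_agents
);
```

Note that some signatures in the default list contain deliberate leading or trailing spaces (for example 'Straw ' and ' Planet/'), so the strings should be compared exactly as written rather than trimmed.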
