b2evolution

Multilingual multiuser multiblog engine

b2evolution Technical Documentation (CVS HEAD)

Procedural File: _stats.php

Source Location: /blogs/conf/_stats.php

Page Details

This is b2evolution's stats config file.

Deprecated:  

TODO: It now holds only things that should be moved elsewhere as part of the hitlog refactoring.

This file sets how b2evolution will log hits and stats. Last significant changes to this file: version 1.6.

Globals
array   $known_search_params [line 178]

Search params needed to extract keywords from a search engine referer url

For example, the referer http://google.com?s=keyphraz typically yields the keyword string keyphraz.

fp> TODO: merge with above table
dh> Piwik (open source web analytics) might have good data to use for this (referrer/search param pairs)
dh> Also useful/required: default encoding of search_params values (e.g. latin1 for suche.t-online.de)
fp> TODO: put into configurable database table

Default value:  array(
   'q',
   'as_q',
   'as_epq',
   'query',
   'search',
   's',
   'p',
   'kw',
   'qs',
   'searchfor',
   'r',
   'rdata',
   'string',
   'su',
   'Gw',
   'text',
   'search_query',
   'wd',
   'keywords',
)
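
The keyword extraction described above can be sketched roughly as follows. This is an illustrative sketch only, not b2evolution's actual implementation: the function name `extract_search_keywords()` is hypothetical, and the sketch assumes that the first known parameter present in the query string wins and that no character-set conversion is needed (the encoding issue raised in the notes above is ignored here).

```php
<?php
// Hypothetical helper, not part of b2evolution: returns the first non-empty
// value of a known search parameter found in the referer's query string,
// or null if none is present.
function extract_search_keywords($referer, array $known_search_params)
{
    $query = parse_url($referer, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return null; // no query string, or the URL could not be parsed
    }

    parse_str($query, $params); // also URL-decodes the values

    foreach ($known_search_params as $name) {
        // Names are compared case-sensitively here ('Gw' is not 'gw').
        if (isset($params[$name]) && is_string($params[$name]) && $params[$name] !== '') {
            return $params[$name];
        }
    }
    return null;
}

// Shortened list for illustration; the full default value is listed above.
$known_search_params = array('q', 'as_q', 'as_epq', 'query', 'search', 's', 'p');

var_dump(extract_search_keywords('http://google.com?s=keyphraz', $known_search_params));
```

Because the list is scanned in order, a referer that carries several known parameters returns the value of whichever parameter appears earliest in the list, not earliest in the URL.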



array   $search_engines [line 102]

Search engines for statistics

The following substrings are looked up in the HTTP Referer header in order to identify search engines.

Default value:  array(
   '//www.google.',
   '//images.google.',
   '//video.google.',
   '.hotbot.',
   '.altavista.',
   '.excite.',
   '.voila.fr/',
   'http://search',
   '://suche.',
   'search.',
   'search2.',
   'http://recherche',
   'recherche.',
   'recherches.',
   'vachercher.',
   'feedster.com/',
   'alltheweb.com/',
   'daypop.com/',
   'feedster.com/',
   'technorati.com/',
   'weblogs.com/',
   'exalead.com/',
   'killou.com/',
   'buscador.terra.es',
   'web.toile.com',
   'metacrawler.com/',
   '.mamma.com/',
   '.dogpile.com/',
   'search1-1.free.fr',
   'search1-2.free.fr',
   'overture.com',
   'startium.com',
   '2020search.com',
   'bestsearchonearth.info',
   'mysearch.com',
   'popdex.com',
   '64.233.167.104',
   'seek.3721.com',
   'http://netscape.',
   'http://www.netscape.',
   '/searchresults/',
   '/websearch?',
   'http://results.',
   'baidu.com/',
   'reacteur.com/',
   'http://www.lmi.fr/',
   'kartoo.com/',
   'icq.com/search',
   'alexa.com/search',
   'att.net/s/',
   'blingo.com/search',
   'crawler.com/search/',
   'inbox.com/search/',
   'scroogle.org/',
   'cuil.com/',
   'yandex.ru/yandsearch',
   'go.mail.ru/search',
   '//www.bing.com',
   '//cc.bingj.com',
   '.qip.ru/',
)
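
A substring lookup of this kind might look like the sketch below. The function name `is_search_engine_referer()` is hypothetical, and the use of a case-sensitive `strpos()` is an assumption; the actual lookup code in b2evolution may differ.

```php
<?php
// Hypothetical helper, not part of b2evolution: true if any signature
// occurs anywhere in the referer string.
function is_search_engine_referer($referer, array $search_engines)
{
    foreach ($search_engines as $signature) {
        // strpos() returns 0 for a match at the start, so compare with !== false.
        if (strpos($referer, $signature) !== false) {
            return true;
        }
    }
    return false;
}

// Shortened list for illustration only.
$search_engines = array('//www.google.', '.altavista.', 'search.', '//www.bing.com');

var_dump(is_search_engine_referer('http://www.google.com/search?q=b2evolution', $search_engines));
var_dump(is_search_engine_referer('http://example.com/research.html', $search_engines));
```

Plain substring matching is cheap but coarse: a short signature such as 'search.' also matches an unrelated URL like http://example.com/research.html, since the signature may occur anywhere in the string.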

Information Tags:
Todo:  fp> merge definitions for search engine sig + keyword param + position param into a single array
Todo:  fp> move to admin interface (specific list editor), include query params
   fp> Actually, I'm not sure a DB table would be a good idea. It would slow down lookup and it's not really a table the average user is going to modify anyway...
   dh> It would be a cheap query, and the processing _might_ be even faster, if we can ask MySQL to do the lookup, instead of iterating the same list always. Also, this is a good thing for autoupdates. You're right about average users changing it though.
Todo:  fp> have regexps, for example for //www\.google\.[^/]+/search
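
As a rough illustration of the regexp idea in the last Todo, the pattern given there could be applied with `preg_match()` as sketched below. The function name `matches_engine_pattern()` and the choice of `#` as the pattern delimiter are assumptions made for this sketch; nothing like it exists in the file documented here.

```php
<?php
// The pattern from the Todo, wrapped in '#' delimiters so that the
// slashes inside it do not need escaping.
$google_search_pattern = '#//www\.google\.[^/]+/search#';

// Hypothetical helper: true if the referer matches the given pattern.
function matches_engine_pattern($referer, $pattern)
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $referer) === 1;
}

var_dump(matches_engine_pattern('http://www.google.co.uk/search?q=test', $google_search_pattern));
var_dump(matches_engine_pattern('http://www.google.com/reader/view', $google_search_pattern));
```

Compared with a bare '//www.google.' substring, a pattern like this would only count Google URLs whose path begins with /search, at the cost of one regular expression evaluation per signature.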


array   $user_agents [line 211]

UserAgent identifiers for logging/statistics

The following substrings are looked up in the HTTP User-Agent header.

Each entry has the form array( type, substring, display name ). The 'aggregator' type is currently used only to "translate" user agent strings; an aggregator hit itself is detected by the feed being accessed.

Default value:  array(
   array('robot', 'Googlebot', 'Google (Googlebot)'),
   array('robot', 'Slurp/', 'Inktomi (Slurp)'),
   array('robot', 'Yahoo! Slurp;', 'Yahoo (Slurp)'),
   array('robot', 'msnbot/', 'MSN Search (msnbot)'),
   array('robot', 'Frontier/', 'Userland (Frontier)'),
   array('robot', 'ping.blo.gs/', 'blo.gs'),
   array('robot', 'organica/', 'Organica'),
   array('robot', 'Blogosphere/', 'Blogosphere'),
   array('robot', 'blogging ecosystem crawler', 'Blogging ecosystem'),
   array('robot', 'FAST-WebCrawler/', 'Fast'),
   array('robot', 'timboBot/', 'Breaking Blogs (timboBot)'),
   array('robot', 'NITLE Blog Spider/', 'NITLE'),
   array('robot', 'The World as a Blog ', 'The World as a Blog'),
   array('robot', 'daypopbot/ ', 'DayPop'),
   array('robot', 'Bitacle bot/', 'Bitacle'),
   array('robot', 'Sphere Scout', 'Sphere Scout'),
   array('robot', 'Gigabot/', 'Gigablast (Gigabot)'),
   array('robot', 'Yandex/', 'Yandex'),
   array('robot', 'Mail.Ru/', 'Mail.Ru'),
   array('robot', 'Baiduspider', 'Baidu spider'),
   array('robot', 'infometrics-bot', 'Infometrics Bot'),
   array('robot', 'DotBot/', 'DotBot'),
   array('robot', 'Twiceler-', 'Cuil (Twiceler)'),
   array('robot', 'discobot/', 'Discovery Engine'),
   array('robot', 'Speedy Spider', 'Entireweb (Speedy Spider)'),
   array('robot', 'monit/', 'Monit'),
   array('robot', 'Sogou web spider', 'Sogou'),
   array('robot', 'Tagoobot/', 'Tagoobot'),
   array('robot', 'MJ12bot/', 'Majestic-12'),
   array('robot', 'ia_archiver', 'Alexa crawler'),
   array('robot', 'KaloogaBot', 'Kalooga'),
   array('robot', 'Flexum/', 'Flexum'),
   array('robot', 'OOZBOT/', 'OOZBOT'),
   array('robot', 'ApptusBot', 'Apptus'),
   array('robot', 'psycheclone', 'Psycheclone'),
   array('aggregator', 'AppleSyndication/', 'Safari RSS (AppleSyndication)'),
   array('aggregator', 'Feedreader', 'Feedreader'),
   array('aggregator', 'Syndirella/', 'Syndirella'),
   array('aggregator', 'rssSearch Harvester/', 'rssSearch Harvester'),
   array('aggregator', 'Newz Crawler', 'Newz Crawler'),
   array('aggregator', 'MagpieRSS/', 'Magpie RSS'),
   array('aggregator', 'CoologFeedSpider', 'CoologFeedSpider'),
   array('aggregator', 'Pompos/', 'Pompos'),
   array('aggregator', 'SharpReader/', 'SharpReader'),
   array('aggregator', 'Straw ', 'Straw'),
   array('aggregator', 'YandexBlog', 'YandexBlog'),
   array('aggregator', ' Planet/', 'Planet Feed Reader'),
   array('aggregator', 'UniversalFeedParser/', 'Universal Feed Parser'),
)
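
A lookup against a table of this shape could be sketched as follows. The function name `identify_user_agent()` and the returned array layout are hypothetical, and first-match-wins with a case-sensitive `strpos()` is an assumption, not a description of how b2evolution's hit logging actually resolves user agents.

```php
<?php
// Hypothetical helper, not part of b2evolution: returns
// array('type' => ..., 'name' => ...) for the first entry whose signature
// occurs in the User-Agent string, or null if nothing matches.
function identify_user_agent($user_agent, array $user_agents)
{
    foreach ($user_agents as $entry) {
        // Each entry is array( type, signature substring, display name ).
        list($type, $signature, $name) = $entry;
        if (strpos($user_agent, $signature) !== false) {
            return array('type' => $type, 'name' => $name);
        }
    }
    return null;
}

// Shortened table for illustration only.
$user_agents = array(
    array('robot', 'Googlebot', 'Google (Googlebot)'),
    array('aggregator', 'MagpieRSS/', 'Magpie RSS'),
);

var_dump(identify_user_agent(
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    $user_agents
));
```

Note that some signatures in the default value contain significant leading or trailing spaces (for example 'Straw ' and ' Planet/'), so the strings should not be trimmed before matching.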





Documentation generated on Sat, 06 Mar 2010 04:23:49 +0100 by phpDocumentor 1.4.2. This site is hosted and maintained by Daniel HAHLER.