Php photo gallery TWG | JFUploader | TWG Flash upload | WFU | Forum
https://www.tinywebgallery.com/forum/

don't count visits from search engines
https://www.tinywebgallery.com/forum/viewtopic.php?f=3&t=3632

Author:  KayM [ 9. Jan 2014, 20:28 ]
Post subject:  don't count visits from search engines

Hi Michael,

I would like to allow search engines to index my gallery, but without counting them as visitors.

I forbid robots from visiting my gallery with robots.txt, but Baidu has started to ignore this because someone has linked to my gallery from elsewhere. On its webpage, Baidu mentions that a page is visited to verify the referring link, even if that is forbidden by robots.txt. Since Baidu (and others) crawl from many IP addresses in parallel, the counter of my gallery shows up to 100 visitors per day from Baidu alone :-(

It may be useful for others as well to allow search engines to index TWG galleries without counting them.

So I started looking at your code and tried to implement search engine robot detection based on the user agent.
Even with a more open robots.txt, my daily visits dropped to the 10+ real visitors per day.

You may find my attached code a useful example for implementing this feature request.
See next post ...

Kay

Author:  KayM [ 9. Jan 2014, 20:34 ]
Post subject:  Example Code

Basic code for detecting and not counting robots:

setbrowser.inc.php:
Code:
[...]
$isIpad = false;
$isIphone = false;
$isSEbot = false; // is search engine bot
[...]
    // detect known bots and bots in general
    $isSEbot = strstr($ua, 'Googlebot') || strstr($ua, 'bingbot') || strstr($ua, 'Baiduspider') || strstr($ua, 'Yandex')
        || strstr($ua, 'waybackarchive') || strstr($ua, 'Spider') || strstr($ua, 'spider') || strstr($ua, 'robot')
        || strstr($ua, 'bot.htm') || strstr($ua, 'Crawler') || strstr($ua, 'crawler');
}


setspecials.inc.php:
Code:
if ($isSEbot) { // special setting for search engines
    $enable_counter = false; // don't count search engine bots, this was the main reason for creating this!
}
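
The substring checks above can also be collected into one case-insensitive helper. This is only a minimal sketch: the function name `twg_is_search_engine_bot` is hypothetical and not part of TWG, and it assumes PHP 7 or later. Because `stripos()` ignores case, it matches slightly more than the original list (for example `Robot` as well as `robot`).

```php
<?php
// Hypothetical helper, not part of TWG: returns true when the user agent
// string contains one of the bot markers (compared case-insensitively).
function twg_is_search_engine_bot(string $ua): bool
{
    // Markers taken from the list in the post above, written in lower case.
    $markers = ['googlebot', 'bingbot', 'baiduspider', 'yandex',
                'waybackarchive', 'spider', 'robot', 'bot.htm', 'crawler'];
    foreach ($markers as $marker) {
        // stripos() returns false when the marker does not occur at all.
        if (stripos($ua, $marker) !== false) {
            return true;
        }
    }
    return false;
}

// A missing User-Agent header is treated as an empty string (not a bot).
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
$isSEbot = twg_is_search_engine_bot($ua);
```

With this in place, the `if ($isSEbot)` block in setspecials.inc.php would stay as it is.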

Author:  KayM [ 9. Jan 2014, 20:49 ]
Post subject:  Present robots a simple page without JS?

Right after my simple code was working as expected, I thought about showing robots a stripped-down page with as few features activated as possible. So I made some guesses about which features to deactivate ... don't trust me on this :-)

setspecials.inc.php
Code:
[...]
if ($isSEbot) { // special setting for search engines
    $enable_counter = false; // don't count search engine bots, this was the main reason for creating this!
    //
    // since we know to have a crawling robot here, try to create simple pages ...
    $enable_language_selector=false;    // crawl only default language
    $show_background_images = false;
    $menu_x = 20; //show as many as possible folders on one page
    $menu_y = 20;
    $thumbnails_x = 20; // show as many as possible thumbnails on one page
    $thumbnails_y = 40;
    $autodetect_maximum_thumbnails = false;
    $show_videos = $low_show_videos = false; // don't show video
    $number_top10 = 1; // not really useful, but I found no parameter to disable the top 10
    $enable_download = false;
    // disabling JS functions for bots may be a good idea, so
    // add " || $isSEbot" to the twg_nojs section below (if nojs is not already detected)
}
[...]
if (isset($_SESSION['twg_nojs']) || $isSEbot) { // no JavaScript or isSEbot - we turn off lots of stuff
    $default_big_navigation = "HTML";
    ...
}
[...]

Author:  TinyWebGallery [ 10. Jan 2014, 17:26 ]
Post subject:  Re: don't count visits from search engines

Hi Kay,

Good idea. But not with
$isSEbot = strstr($ua, 'Googlebot') || strstr($ua, 'bingbot') || strstr($ua, 'Baiduspider') || strstr($ua, 'Yandex')
|| strstr($ua, 'waybackarchive') || strstr($ua, 'Spider') || strstr($ua, 'spider') || strstr($ua, 'robot')
|| strstr($ua, 'bot.htm') || strstr($ua, 'Crawler') || strstr($ua, 'crawler');

I have recently added browser detection to one of my other PHP projects. I'll add this to TWG as well. Then I also get much better mobile detection....

Author:  KayM [ 11. Jan 2014, 08:52 ]
Post subject:  Re: don't count visits from search engines

Hi Michael,

I know the detection code is ugly, but it's working for now :-)
So I'm looking forward to seeing your implementation.
Thanks!

Kay

Author:  TinyWebGallery [ 11. Jan 2014, 16:49 ]
Post subject:  Re: don't count visits from search engines

It uses the detection file from here: http://browscap.org/
Adding support for spiders to my implementation should be no problem...

Best, Michael
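
For readers who want to try browscap-based detection themselves, one possible route is PHP's built-in `get_browser()`, which reads a browscap.ini file configured through the `browscap` directive in php.ini. The thread does not say whether Michael's implementation uses `get_browser()` or a separate parser, so the sketch below is an assumption, and the function name `twg_browscap_is_crawler` is hypothetical.

```php
<?php
// Hypothetical helper, not TWG's implementation.
// Returns true or false when browscap data is available, or null when it is not.
function twg_browscap_is_crawler(string $ua): ?bool
{
    // get_browser() only works when the "browscap" php.ini directive
    // points to a browscap.ini file (for example one from browscap.org).
    if (!ini_get('browscap')) {
        return null;
    }
    $info = @get_browser($ua, true); // true: return an array instead of an object
    if (!is_array($info) || !isset($info['crawler'])) {
        return null;
    }
    // The flag may be a boolean or a string such as "1" or "true".
    return filter_var($info['crawler'], FILTER_VALIDATE_BOOLEAN);
}

$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
// Treat "unknown" (null) as a normal visitor so the counter keeps working.
$isSEbot = twg_browscap_is_crawler($ua) === true;
```

Falling back to "not a bot" when no browscap file is configured is a design choice of this sketch: it keeps the visit counter working on servers without browscap, at the cost of counting crawlers there.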

All times are UTC + 1 hour [ DST ]