Get help for TinyWebGallery, the best image gallery. The forum is also home for the Joomla JFUploader, TWG Flash Uploader and the Wordpress flash uploader.

All times are UTC + 1 hour [ DST ]




PostPosted: 9. Jan 2014, 20:28 

Joined: 3. Jan 2014, 09:28
Posts: 15
Hi Michael,

I'd like to allow search engines to index my gallery, but without counting them as visitors.

I blocked robots from my gallery with robots.txt, but Baidu started ignoring this because someone linked to my gallery from elsewhere. On its website, Baidu mentions that a page is visited to verify the referring link even if robots.txt forbids it. Since Baidu (and others) crawl from many IP addresses in parallel, my gallery's counter shows up to 100 visitors per day from Baidu alone :-(

It may be useful for others, too, to allow search engines to index TWG galleries without counting them.

So I started looking at your code and tried to implement search engine robot detection based on the user agent.
Even with a more open robots.txt, my daily visits dropped to the 10+ real visitors per day.

You may find my code useful as an example for implementing this feature request.
See the next post ...

Kay


Last edited by KayM on 9. Jan 2014, 20:54, edited 2 times in total.

 Post subject: Example Code
PostPosted: 9. Jan 2014, 20:34 

Joined: 3. Jan 2014, 09:28
Posts: 15
Basic code for detecting and not counting robots:

setbrowser.inc.php:
Code:
[...]
$isIpad = false;
$isIphone = false;
$isSEbot = false; // is search engine bot
[...]
    // detect known bots and bots in general
    $isSEbot = strstr($ua, 'Googlebot') || strstr($ua, 'bingbot') || strstr($ua, 'Baiduspider') || strstr($ua, 'Yandex')
        || strstr($ua, 'waybackarchive') || strstr($ua, 'Spider') || strstr($ua, 'spider') || strstr($ua, 'robot')
        || strstr($ua, 'bot.htm') || strstr($ua, 'Crawler') || strstr($ua, 'crawler');
}


setspecials.inc.php:
Code:
if ($isSEbot) { // special setting for search engines
    $enable_counter = false; // don't count search engine bots; this was the main reason for creating this!
}
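
The substring checks above could also be folded into a single case-insensitive helper. This is only a minimal sketch: the function name `twg_is_search_bot` and the marker list are assumptions made for illustration and are not part of TWG. Because matching is case-insensitive, it flags slightly more user agents than the original case-sensitive `strstr()` chain.

```php
<?php
// Hypothetical helper; name and marker list are illustrative, not part of TWG.
function twg_is_search_bot($ua)
{
    // Markers taken from the substring checks above, lower-cased.
    // 'spider' also covers 'Spider' and 'Baiduspider' because stripos()
    // ignores case.
    $markers = array('googlebot', 'bingbot', 'yandex', 'waybackarchive',
                     'spider', 'robot', 'bot.htm', 'crawler');
    foreach ($markers as $marker) {
        // stripos() returns 0 for a match at position 0, so compare with false.
        if (stripos($ua, $marker) !== false) {
            return true;
        }
    }
    return false;
}

// The header may be missing entirely, so fall back to an empty string.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$isSEbot = twg_is_search_bot($ua);
```

User-agent strings can be spoofed, so a check like this is only suitable for soft decisions such as skipping the visitor counter.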


PostPosted: 9. Jan 2014, 20:49 

Joined: 3. Jan 2014, 09:28
Posts: 15
Right after my simple code was working as expected, I thought about showing robots a stripped-down page with as many features as possible deactivated. So I made some guesses about which features to deactivate ... don't trust me on this :-)

setspecials.inc.php
Code:
[...]
if ($isSEbot) { // special settings for search engines
    $enable_counter = false; // don't count search engine bots; this was the main reason for creating this!
    //
    // since we know we have a crawling robot here, try to create simple pages ...
    $enable_language_selector = false;    // crawl only the default language
    $show_background_images = false;
    $menu_x = 20; // show as many folders as possible on one page
    $menu_y = 20;
    $thumbnails_x = 20; // show as many thumbnails as possible on one page
    $thumbnails_y = 40;
    $autodetect_maximum_thumbnails = false;
    $show_videos = $low_show_videos = false; // don't show videos
    $number_top10 = 1; // not really useful, but I found no parameter to disable the top 10
    $enable_download = false; // the original post was missing this semicolon
    // disabling JS functions for bots may be a good idea, so
    // add " || $isSEbot" to the twg_noJS section below (if nojs is not already detected)
}
[...]
if (isset($_SESSION['twg_nojs']) || $isSEbot) { // no JavaScript or isSEbot - we turn off lots of stuff
    $default_big_navigation = "HTML";
    ...
}
[...]


PostPosted: 10. Jan 2014, 17:26 
Site Admin

Joined: 1. Aug 2005, 12:53
Posts: 10955
Hi Kay,

Good idea. But not with
$isSEbot = strstr($ua, 'Googlebot') || strstr($ua, 'bingbot') || strstr($ua, 'Baiduspider') || strstr($ua, 'Yandex')
|| strstr($ua, 'waybackarchive') || strstr($ua, 'Spider') || strstr($ua, 'spider') || strstr($ua, 'robot')
|| strstr($ua, 'bot.htm') || strstr($ua, 'Crawler') || strstr($ua, 'crawler');

I recently added browser detection to one of my other PHP projects. I'll add it to TWG as well. Then I also get much better mobile detection....


PostPosted: 11. Jan 2014, 08:52 

Joined: 3. Jan 2014, 09:28
Posts: 15
Hi Michael,

I know the detection code is ugly, but it's working for now :-)
So I'm looking forward to seeing your implementation.
Thanks!

Kay


PostPosted: 11. Jan 2014, 16:49 
Site Admin

Joined: 1. Aug 2005, 12:53
Posts: 10955
It uses the detection file from here: http://browscap.org/
Adding spider support to my implementation should be no problem...

Best, Michael
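
For readers who want to try browscap-based detection themselves, PHP's built-in `get_browser()` reads a browscap.ini file such as the one from browscap.org. The following is only a rough sketch and not Michael's implementation: the function name `twg_is_crawler` is hypothetical, it assumes the `browscap` php.ini directive points at a file that includes the `crawler` property, and the keyword fallback is an assumption added so the function still returns something sensible when browscap is not configured.

```php
<?php
// Hypothetical wrapper; the name is illustrative and not part of TWG.
function twg_is_crawler($ua)
{
    // get_browser() needs the "browscap" php.ini directive to be set;
    // without it PHP raises a warning and returns false, so check first.
    if (ini_get('browscap')) {
        $info = get_browser($ua, true); // true = return an array
        if (is_array($info) && isset($info['crawler'])) {
            // PHP normally reports boolean browscap flags as '1' or '';
            // the extra 'false' check is purely defensive.
            return !empty($info['crawler']) && $info['crawler'] !== 'false';
        }
    }
    // Fallback (an assumption, not browscap behaviour): coarse keyword match.
    return preg_match('/bot|spider|crawler/i', $ua) === 1;
}

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$isSEbot = twg_is_crawler($ua);
```

Compared with a hand-maintained keyword list, the browscap file is updated centrally, but it has to be refreshed regularly to recognise new crawlers.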

