
Site is not indexed in Google yet


Hi,

My site is not yet indexed in Google, but it has been indexed in Yahoo and Windows Live. I added the URL to Google, and judging by my other sites that Google has indexed, enough time has now passed for this one to be indexed too. I suspect my robots.txt file may be the reason it is not indexed yet. Here is the content of my robots.txt. Is it correct, or what should I change in this file so that Google indexes my site?

User-agent: Baidu
Disallow: /

User-agent: *alexa*
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Fasterfox
Disallow: /

User-agent: *
Crawl-delay: 20
Disallow: /admin.php
Disallow: /error.php
Disallow: /admin/
Disallow: /blocks/
Disallow: /cache/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /modules/
Disallow: /themes/

Thank you in advance to anyone who can help me.

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Linux/Apache/MySQL/PHP/DF 9.1.2.1


Try www.google.com/webmaster. It should be a real help!

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Linux/2.0.54 (Unix)/5.0.51/5.2.6/DragonflyCMS 9.2.1


OK, I'll try. Thanks for the information.



I tried Google Webmaster Tools, but I cannot verify my site there. I created the HTML file for verification, and it says they are experiencing a temporary problem. After that I used the meta tag method to verify my site for Webmaster Tools, and then it reports a 403 Forbidden status in the headers.

Anyway, my site rejects the Google crawler. Why does this happen? Can anybody tell me how your site got indexed in Google? What should I do? I've posted the contents of my robots.txt file. Are they correct for getting indexed in Google, or is some other optimization needed?

Thank you in advance to anyone who can help me with this.



If you use the robots.txt provided by DF, it is correct.
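If you want to double-check a robots.txt locally, here is a minimal sketch using Python's standard urllib.robotparser, fed with the rules posted earlier in this thread. The example.com URLs and the is_allowed helper are placeholders of mine, and this parser only approximates how a real crawler reads the file, so treat the result as a sanity check and not as proof of what Google does.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt posted earlier in this thread.
ROBOTS_TXT = """\
User-agent: Baidu
Disallow: /

User-agent: *alexa*
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Fasterfox
Disallow: /

User-agent: *
Crawl-delay: 20
Disallow: /admin.php
Disallow: /error.php
Disallow: /admin/
Disallow: /blocks/
Disallow: /cache/
Disallow: /images/
Disallow: /includes/
Disallow: /language/
Disallow: /modules/
Disallow: /themes/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the parsed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host; only the path matters here.
    checks = [
        ("Googlebot", "http://example.com/"),
        ("Googlebot", "http://example.com/admin.php"),
        ("Googlebot-Image", "http://example.com/"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(agent, url))
```

With these rules the home page comes out as allowed for Googlebot, which suggests the robots.txt is not what keeps the site out of the index.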

There are numerous reasons why a site may not be indexed. The most common are that indexing takes time, that the site has been blacklisted by Google (for being naughty), that it was dropped after being unavailable for a long period, or that it is frequently unavailable. A search of Google will provide greater insight.

Site optimization is a massive area and really is a matter for you to investigate. DragonflyCMS itself provides a very good platform for search engines - site availability is a matter between you and your host, and quality content is a matter for you.

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Ubuntu/Apache 2.2.22/MySQL 5.6.34/PHP 7.1.22/DragonFly 10.0.48.9418


I found something: the problem is my .htaccess. When I tried to verify my site in Google Webmaster Tools, verification failed because Google could not access the .html file I had uploaded to the site root as Google instructed. So I removed my .htaccess and started the verification process again, and this time the site was verified successfully. I therefore suspect that my .htaccess does not allow Googlebot to crawl the site, but I don't know whether that is correct. My .htaccess follows.

Can you tell me whether this .htaccess allows Googlebot to access my site? Thanks in advance.

# CPG Dragonfly CMS
# Copyright (c) 2004-2006 by CPG-Nuke Dev Team, dragonflycms.org
# Released under the GNU GPL version 2 or any later version
# $Source: /cvs/html/.htaccess,v $
# $Revision: 9.18 $
# $Author: nanocaiordo $
# $Date: 2007/01/17 03:24:44 $

# Remove the pound sign on these 3 for production sites
# if your server doesn't allow it then a Error 500 is given
# php_flag display_errors off
# php_value error_reporting 0
# php_flag register_globals 0

# flood protection
# deny most common except .php
<FilesMatch "\.(inc|tpl|h|ihtml|sql|ini|conf|bin|spd|theme|module)$">
deny from all
</FilesMatch>

# disable access to config.php and .ht* from a browser
<FilesMatch "^(config\.php|\.ht)">
Deny from all
</FilesMatch>
<FilesMatch "error\.(php|gif)">
allow from all
</FilesMatch>

# if you use LEO, mod_rewrite is necessary
<IfModule mod_rewrite.c>
RewriteEngine On

# Check for Santy Worms and redirect them to a fail page
#-------------------------------------------------------------------
# Variant -1
# uncomment if you dont use LWP
# RewriteCond %{HTTP_USER_AGENT} ^LWP [NC,OR]
# Variant -2
RewriteCond %{REQUEST_URI} ^visualcoders [NC,OR]
# Variant -3
RewriteCond %{QUERY_STRING} rush=([^&]+) [NC,OR]
# Variant -4
RewriteCond %{HTTP:x-moz} ^prefetch [NC,OR]
RewriteCond %{X-moz} ^prefetch [NC,OR]
# block local file, sql and remote attacks
RewriteCond %{QUERY_STRING} =../ [NC,OR]
RewriteCond %{QUERY_STRING} "%20UNION" [NC,OR]
RewriteCond %{QUERY_STRING} =http:// [NC]
# deny them
RewriteRule ^.*$ - [F]
#-------------------------------------------------------------------

RewriteCond %{REQUEST_FILENAME} -f [NC,OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*)$ - [L]

#BizStore
RewriteRule ^amazon/(.*)/(.*)-search-(.*)-(.*)-(.*).html index.php?name=BizStore&Operation=ItemSearch&SearchIndex=$1&$2=$3&Sort=$4&ItemPage=$5 [QSA,L]
RewriteRule ^amazon/(.*)/(.*)-search-(.*)-(.*).html index.php?name=BizStore&Operation=ItemSearch&SearchIndex=$1&$2=$3&Sort=$4 [QSA,L]
RewriteRule ^amazon/(.*)/(.*)-search-(.*).html index.php?name=BizStore&Operation=ItemSearch&SearchIndex=$1&$2=$3 [QSA,L]
RewriteRule ^amazon/(.*)/browse-(.*)-(.*)-(.*).html index.php?name=BizStore&Operation=ItemSearch&SearchIndex=$1&BrowseNode=$2&Sort=$3&ItemPage=$4 [QSA,L]
RewriteRule ^amazon/(.*)/browse-(.*)-(.*).html index.php?name=BizStore&Operation=ItemSearch&SearchIndex=$1&BrowseNode=$2&Sort=$3 [QSA,L]
RewriteRule ^amazon/(.*)/browse-(.*).html index.php?name=BizStore&Operation=ItemSearch&SearchIndex=$1&BrowseNode=$2 [QSA,L]
RewriteRule ^amazon/(.*)/review-(.*).html index.php?name=BizStore&Operation=CustomerReviews&ItemId=$1&ReviewPage=$2 [QSA,L]
RewriteRule ^amazon/(.*)/(.*).html index.php?name=BizStore&Operation=ItemLookup&ItemId=$1 [QSA,L]
RewriteRule ^amazon/(.*).html?$ index.php?name=BizStore&SearchIndex=$1 [QSA,L]

# if you use LEO and CPG-Nuke is installed in a sub-directory like '/html',
# remove that # before RewriteBase and rename /html to the path of the sub-directory
RewriteBase /

# RewriteRule ^index\.html /index.php
RewriteRule ^([a-zA-Z0-9_=+-]+)(/|\.html)$ index=$1 [L,S=5]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)(/|\.html)$ index=$1&file=$2 [L,S=4]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)$ index=$1&file=$2 [L,S=3]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/(.*)(/|\.html)$ index=$1&file=$2&$3 [L,S=1]
RewriteRule ^([a-zA-Z0-9_]+)/(.*)(/|\.html)$ index=$1&file=index&$2 [L]
RewriteRule ^index=(.*[^/])/(.*) index=$1&$2 [N,L]
RewriteRule ^index=(.*) index.php?name=$1 [L]
</IfModule>

# use custom error pages if you wish
ErrorDocument 400 /error.php
ErrorDocument 401 /error.php
ErrorDocument 403 /error.php
ErrorDocument 404 /error.php
ErrorDocument 500 /error.php

# disallow index viewing (like ftp) of directory
# Remove # for production sites
# Options -Indexes

# for hosts that don't allow the above, we won't give people anything to look at
<IfModule mod_autoindex.c>
IndexIgnore *
</IfModule>

AddDefaultCharset utf-8
AddType x-mapp-php5 .php



Did you ever figure out the problem with the .htaccess file? I too have verified that the problem is the Apache access file, yet I am not clear on how to fix it either.

Any input would be great. Thanks.

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
linux/apache/php5/DF9.2.1


OK, FINALLY! After pulling most of my hair out and testing and playing with just about everything I could think of (robots.txt file, .htaccess file, Apache settings, DF settings), and since I have yet to find the answer here on the forums, I finally figured it out on my own (I hope, lol)...

To resolve this problem, go into your security settings in DF, find the "Unknown User-Agents" option, and make it INACTIVE. What's happening is that if your installation of DF does not recognize the spider or bot, that bot is disallowed from crawling or indexing the site.

Normally this would be a good thing, since it keeps email harvesters and junk like that out of your website files, but if certain existing or newer bots or bot names are not in your bot list, then guess what? They're blocked too...

In this case, the bot being refused is "Google-Sitemaps", NOT Google or Googlebot, so by turning off the "Unknown User-Agents" setting you allow this bot and all the others that are also being refused, such as MSN, speedy_spider and mozilla. If someone would like to add these to the security settings, that would be great, but for now I choose to allow them all rather than exclude some of the more important ones while trying to keep out the trash.
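To illustrate the mechanism, here is a rough sketch in Python. It is not DF's actual code: the KNOWN_AGENTS entries and the status_for function are made up for the example, and the real list is whatever your DF install ships with. The idea is simply that a request whose User-Agent matches nothing in the list gets refused while the check is active.

```python
# Hypothetical list of user-agent fragments the site recognizes.
# These entries are illustrative only, not DF's real bot list.
KNOWN_AGENTS = ("googlebot", "slurp", "firefox", "msie")


def status_for(user_agent, block_unknown=True, known_agents=KNOWN_AGENTS):
    """Return the HTTP status a simple 'unknown user-agent' filter would send."""
    ua = user_agent.lower()
    recognized = any(name in ua for name in known_agents)
    if block_unknown and not recognized:
        return 403  # unknown agent refused while the check is active
    return 200


if __name__ == "__main__":
    print(status_for("Googlebot/2.1 (+http://www.google.com/bot.html)"))
    print(status_for("Google-Sitemaps/1.0"))
    print(status_for("Google-Sitemaps/1.0", block_unknown=False))
```

In this sketch "Google-Sitemaps/1.0" does not contain "googlebot", so it is treated as unknown even though the regular Google crawler passes. Turning the check off lets it through, along with everything else.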

Your thoughts?

74.125.16.68 - - [28/Aug/2008:21:02:59 -0500] "GET /google036f218dd7329592.html HTTP/1.1" 200 523 "-" "Google-Sitemaps/1.0"
74.125.16.68 - - [28/Aug/2008:21:02:59 -0500] "GET /noexist_036f218dd7329592.html HTTP/1.1" 404 2516 "-" "Google-Sitemaps/1.0"
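If you want to see which bots your site is turning away, a small script over the access log helps. This is a sketch that assumes Apache's combined log format, as in the two lines above; the regex and the statuses_by_agent name are my own, so adjust them to your log layout. Entries with status 403 are the ones to look for.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def statuses_by_agent(lines):
    """Count requests per (user agent, HTTP status) pair; skip unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match.group("agent"), int(match.group("status")))] += 1
    return counts


# The two log lines quoted in this post.
SAMPLE = [
    '74.125.16.68 - - [28/Aug/2008:21:02:59 -0500] '
    '"GET /google036f218dd7329592.html HTTP/1.1" 200 523 "-" "Google-Sitemaps/1.0"',
    '74.125.16.68 - - [28/Aug/2008:21:02:59 -0500] '
    '"GET /noexist_036f218dd7329592.html HTTP/1.1" 404 2516 "-" "Google-Sitemaps/1.0"',
]

if __name__ == "__main__":
    for (agent, status), hits in sorted(statuses_by_agent(SAMPLE).items()):
        print(agent, status, hits)
```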



Verifying Website in Google Webmaster tools

Solved on:
dragonflycms.org/Forum...c/t=19316/

try searching ;D

www.greenday2k.net



I had this same problem - running DF 9.2 with the latest ua.inc file.
Had to turn off Unknown user agent check.
Why is this not a known user agent??

Admin - Great Lakes Web Designs
Theme Designer - WebSite Guru Designs
Site Admin - Families with Food Allergies

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Linux 2.6.27-grsec/Apache 2.2.11/MySQL 5.0.67-community-log/PHP 5.2.8/DF 9.2.1


It's in HEAD and the 9.2 branch, but the change will only be applied on the next upgrade.

.:: I met php the 03 December 2003 :: Unforgettable day! ::.

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
CloudLinux / Apache 2.4 LSAPI / MySQLi 5.6 / PHP 5.6 / DCVS


It would be nice if, in the general settings of the admin section,
we could add custom headers or just the KEY for this Google app.



Hi,
Google is getting more complicated than The Matrix, which nobody understood but everyone enjoyed watching.

Google specifically tags different websites according to their business logic. As a rule, it keeps tabs on the number of times it finds fresh content on a particular website. If a website updates its content every 2-3 days, the Google crawler will visit that website frequently. The crawler is thirsty for good content and is always searching for websites that produce quality content on a regular basis.


Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
windows 7
