Inside Google Sitemaps: December 2005
Your source for product news and developments

www vs non-www versions of a site
Two URLs to a site -- one that is prefaced with www and one that is not (for instance, http://www.example.com/ and http://example.com/) -- often point to the same location on a server. But depending on the server configuration, they may point to different locations, so search engines can't assume they are the same. This post provides tips for viewing stats if the www and non-www versions of your site point to the same location.

If you have added your site without prefacing the domain with www (for instance, http://example.com/), and the www version of this domain points to the same location, try adding the www version of the domain (for instance, http://www.example.com/) to your account. You may see a wider variety of stats for the www version of the domain. You can add a site by:
If the verification file still exists in the root of your site and both versions of the domain point to the same location, you can verify the second version simply by accessing the Verify tab and clicking the Verify button.

Note that having both versions of the site's URL listed in your account won't affect the indexing of your site as long as you have submitted a Sitemap for only one version - the version you want to be indexed. Don't submit a Sitemap for both versions if the location and content are the same. If both domains point to the same location and you have pages indexed under both versions, see our Google Help Center for more information on consolidating the listings under one domain. We hear requests for help with this often, so we'll be looking at ways to address this issue in the coming months.

Verifying a site located in a subdirectory
If your site is located in a subdirectory, rather than at the root level of a domain (for instance, at http://www.example.com/site/), you are given two choices when you verify. You can either verify at the subdirectory level or at the root level. If you verify at the root level (http://www.example.com/), we can show you a greater variety of statistics. If you are unable to upload a file at the root level, you can still view error information for the subdirectory at which you've verified, as well as information about your Sitemaps. Verification does not affect your Sitemap submission or the crawling or indexing of your site.

If you verify at the subdirectory level, some statistics are not available. Instead, some stats pages provide a link for verifying at the root level. If you have root-level access, simply click the verify link, place the requested verification file in the root directory and then click Verify.

Some questions you may have about verifying at the subdirectory level:

I verified successfully at http://www.example.com/site/. But when I access my stats, I see a message asking me to verify again.
Why is it asking me this since I already verified?
That message is providing a link so you can verify at the root level. Rest assured that your site is still verified at the subdirectory level and you can access all information that we have available at this time for subdirectories (index stats, Sitemap details, and site errors).

I verified successfully at http://www.example.com/site/. But I can still see the Verify tab. When I access that tab and click "Verify", I get a message that Google couldn't find my verification file and that my site isn't verified. The original verification file is still present at http://www.example.com/site/, so why am I getting this message?
If you are successfully verified at the subdirectory level, the Verify tab will still appear so that you can later go back and verify at the root level. When you are already verified at the subdirectory level, we look for the verification file in the root of the domain when you try to verify again. You'll see an error message if we don't find it there.

I can't verify at the root. Will this affect my listings in Google?
No. Verification at the root lets you see a greater variety of statistics for your site. It does not affect how we use your Sitemap, how we crawl your site, your PageRank, or any other factor.

A previous version let me verify at the subdirectory level. Why did you change it in this version?
You can still verify at the subdirectory level as you could before and see everything you could before (everything listed under the Sitemaps tab and Errors tab). We've added the option of verifying at the root, which lets you see root-level stats.

More query stats; verification enhancements
We've made a few improvements that we wanted to let you know about.

Expanded query stats
We've expanded the number of query stats that we show you (both top search queries and top search query clicks). Note that the exact number of stats you see depends on how often your site comes up in search results.
A large site that has been around for a while will generally have more stats than a new, smaller site. Verification enhancements We've also been working on the verification process. Last month, we posted to the Group that we were looking for your input on how to improve this process. Yesterday, we told you that we added support for lowercase verification filenames. Now, we've added another improvement. Some of you have received one of the following messages when you try to verify: Our system has experienced a temporary problem. Or, The system is currently busy. Please try again in a few minutes. In the past, you've had to manually try again later. Now, instead of seeing one of those messages, you'll see this message: The system is currently busy. We will process this verification as soon as possible. Please check back later for an updated status. We'll add your verification request to a queue and try it for you later, so that you don't have to keep trying manually. If you receive another error that might be due to a temporary issue (for instance, if we can't reach your server due to a DNS timeout), we'll show you the error, but we'll also add your verification request to the queue.
Lowercase verification filenames
We've been looking into ways to enhance our verification process and noticed that some members of our Group have been unable to upload case-sensitive filenames to their webservers. To help them, we now check for a lowercase verification filename if we don't find the case-sensitive one. So now you can upload a verification file with the lowercase filename. (If you have already verified, you don't need to do anything new.) Thanks to our Groups members for their input -- and keep that feedback coming!

New version of Sitemap Generator
We recently uploaded a new version (v1.4) of the Sitemap Generator tool. This version has the same features as the last one, but fixes a subtle bug in writing GZip compressed Sitemap files. The old version stored more path information than it needed to when it created GZip files, and this was a point of concern for some webmasters. The bug was found, and the bugfix suggested, by members of the Sitemaps community. Thanks for bringing it to our attention.

Trouble with verification
Some of you have had one of the following responses when trying to verify: "Our system has experienced a temporary problem." or "The system is currently busy. Please try again in a few minutes." However, when you do try again later, you continue to get one of these messages. This is a known issue and we are working to resolve it as quickly as possible. Thanks for your patience.

If you don't see the full range of stats
The stats we show you are all about what Google knows about your site. If some stats aren't available, it's probably because we don't know a whole lot about your site yet. As we learn more about it (from your Sitemap and our crawling mechanisms), we'll have more information to show you. If your site is new to our index, doesn't yet have a lot of pages indexed, or hasn't come up in many searches, we may not have a lot of stats available yet.
You can easily get an idea of your indexed pages by accessing the Index stats page and clicking the link in the Indexed pages in your site row of the table. Over time, our knowledge about your site will increase and we can share that with you.

Third-party programs
We've just updated the list of third-party programs that you can use to create Sitemaps. You can check out the current list on code.google.com. That page also lists the Sitemaps Google Group for each language we support. If you've written a tool that supports Google Sitemaps, let us know at code+sitemaps@google.com. The Sitemaps team is thankful for our great user community and appreciates all the work that's gone into these tools.

Sitemaps in Japanese
Last week, we added support for Japanese. As with the other languages we support, if you already use Google in Japanese, you should see the user interface and documentation in Japanese automatically. Otherwise, you can click the Preferences link from the Google home page and choose Japanese from the interface list. We have also added a Japanese Sitemaps Google Group.

Site Verification
This morning we learned of an issue with the Google Sitemaps tool that may have temporarily enabled users to view statistics about sites they do not own. We acted quickly and fixed the issue. To ensure the security of all sites using the Google Sitemaps tool, we will re-verify all sites added in the last 48 hours.

When we first started showing statistics a couple of months ago, we put a system in place to prevent anyone other than site owners from seeing stats for a site. We ask each site owner to place a unique file on the site and then we check to see if that file exists. When we do that check, we first make sure that the server isn't misconfigured to return a valid page when a request is made for a page that doesn't exist. We only verify sites that are configured correctly. You can read more about that process in our documentation.
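As a rough illustration of the two-step check just described, here is a minimal sketch in Python. It is not Google's implementation: the function name, the probe filename, the sample verification filename, and the injected `get_status` callable (which stands in for an HTTP request that returns a status code) are all assumptions made for this example.

```python
from typing import Callable
from urllib.parse import urljoin


def verify_site(site_url: str, verification_file: str,
                get_status: Callable[[str], int]) -> str:
    """Return 'verified', 'misconfigured', or 'file_not_found'.

    site_url is expected to end with a slash (for instance,
    http://www.example.com/). get_status is a hypothetical helper
    that returns the HTTP status code for a URL.
    """
    # Step 1: ask for a page that should not exist. A correctly
    # configured server answers 404; anything else (200, or a 301/302
    # redirect to a real page) means the file check can't be trusted.
    probe_url = urljoin(site_url, "page-that-should-not-exist-12345.html")
    if get_status(probe_url) != 404:
        return "misconfigured"

    # Step 2: only now look for the owner's unique verification file.
    file_url = urljoin(site_url, verification_file)
    if get_status(file_url) == 200:
        return "verified"
    return "file_not_found"
```

Passing the status lookup in as a parameter keeps the sketch free of network code, so it can be tried with a simple dictionary of URLs and status codes.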
Unfortunately, with our latest release, a bug prevented this process from working correctly. We fixed this as soon as we found out about the problem. We take your privacy very seriously and are currently investigating other approaches to further enhance security.

More stats!
We've just launched some new features. The biggest change for those of you already using Sitemaps: if you've verified your site, you'll see substantially more stats and error details. The biggest change for new users: you can now add a site to your account even before you create a Sitemap for it. Once you've added and verified your site, you can see all these new stats and errors.

New stats: how the Googlebot views your site
The new stats we're showing you are all about letting you know how we see your site. With query stats, we show you the top Google search queries that return pages from your site as well as the top queries that caused users to click on your site in the search results. With crawl stats, you can see how we view crawled pages. You can see a distribution of the pages successfully crawled and the pages with errors as well as a distribution of PageRank for the pages in your site. Page analysis shows you what we detect about the content and encoding of your pages. Index stats provide an easy way for you to use our advanced search operators to return results about how we see the indexed pages of your site.

Mobile stats
You can now verify your mobile sites and see stats for them.

More detailed errors
Now you’ll have more details about problems we had crawling your site. We report on 40 different types of errors in 5 categories.

Adding a site
Even before you have a Sitemap for your site, you can still take advantage of all the statistics and error information we have available. Simply create a Sitemaps account and add a site to it. Once you verify site ownership, we provide the full range of statistics and error details.
Of course, we encourage you to add a Sitemap so we can learn more about your site.

Changing domains
From our Google Group:

I've moved my site to a new domain. Can I submit a Sitemap to tell you to index the new site rather than the old site?
Submitting a Sitemap for the new site is a great first step, because that helps us learn about the new pages right away. Make sure you place the new Sitemap in the root directory of the new site as the Sitemap must be located on the same domain as the site URLs contained in it. Another important thing to do is redirect visitors from the old site to the new one. Put a 301 (permanent) redirect on every page of the old site to point to the corresponding page on the new site. You can find out more about 301 HTTP redirects from RFC-2616 and you can learn more about how to make the site move a smooth one from our Google Help Center.

URLs with HTTP errors
What can you do about URLs that we tried to crawl but couldn't because we received a 404 (not found) error? (You can see these once you've verified your site by clicking the stats link beside the Sitemap name on the My Sitemaps page.) You don't have to do anything about them. We'll continue to crawl and index your site and will simply skip pages that return a 404. But here are some things you can do.

If we found the URLs from your Sitemap, the fix is simple. Just modify your Sitemap to list the correct pages and resubmit it. If we found the URLs by following links, the fix isn't quite as easy. In fact, in some cases, there may be no fix. A webmaster may have liked your site and tried to link to it, but mistyped the URL. You can look for sites that link to your pages and ask webmasters to fix any broken links, but if that sounds like a lot of work, you can instead just focus on your own site.

Check the links in your site
You may not be able to control inbound links from other sites, but you can control internal links. Make sure that none of these broken links are coming from your site.
You can generally check your webserver logs to see what visitors clicked on in your site that returned 404 errors. If the links are outdated It could be that a link points to a non-existent page because that page used to exist, but no longer does. In that case, you can:
In order to use this system, the outdated page must return a 404 (and if the URL is showing up on your Sitemaps Stats page, it already does). Log in and then choose the Remove an outdated link option. Type in the URL, choose anything associated with this URL, and click Remove outdated link. The link will show up in a status area as pending. The page should be removed from the index within 3-5 days and the status will be updated.

What to do when your Sitemap status is "Denied URLs"
If your Sitemap status is "Denied URLs" and the error listed is "URL not under Sitemap path", here are some things to check.

Make sure the URL root matches
If you submit your Sitemap using the path http://example.com/sitemap.xml, then the URLs in your Sitemap should begin with example.com. Any URLs that begin with www.example.com aren't considered to be under the Sitemap path. Along those lines, if you submit your Sitemap using the path http://www.example.com/sitemap, the URLs in your Sitemap should begin with www.example.com. To fix this problem, you can either edit the URLs listed in your Sitemap file to match the submitted path, or you can delete the Sitemap and then submit it again using the path that matches the URLs listed in it.

Make sure the Sitemap is at the highest-level directory
If you submit your Sitemap using the path http://www.example.com/sample_folder/sitemap.xml, then all URLs in that Sitemap must begin with http://www.example.com/sample_folder/. This means that http://www.example.com/sample.html wouldn't be considered a valid URL in the Sitemap. If at all possible, place your Sitemap at the root location of your site to avoid these types of problems. If you can't place the Sitemap at the root, then list only URLs from the Sitemap location and lower. See the Sitemaps documentation for more details.

When your site changes
Here's another question we've gotten. If you have a question about Sitemaps, let us know by posting in the Sitemaps Google Group.

I've submitted a Sitemap for my site. What should I do when my site changes?
If you add pages to your site, you can let us know in several different ways:
If you've deleted pages from your site, you can delete those pages from your Sitemap and then resubmit it (either manually through your Google Account or by pinging us).

Including site pages in a Sitemap
From time to time, we use this blog to answer some common questions. Here's one:

Do I have to include every URL from my site in my Sitemap? If I don't include some of them, will they be excluded from the Google index?
Your Sitemap provides us with an additional way to learn about your site. We still use all of our other methods, such as following links from your site's HTML sitemap and from pages that link to you. We discover URLs that you don't include in your Sitemap through these regular crawling processes -- it just may take us longer, and we won't have any extra information that you can provide in a Sitemap (such as priority, last modification date, and change frequency). We won't exclude URLs that you don't list in your Sitemap from the Google index.

Searching what Google knows about your site
So, you've submitted your Sitemap. How can you tell what Google knows about your site? You can use Google’s advanced search features to get a list of:
You can do many of these advanced searches (amazingly enough) by clicking the Advanced Search link on www.google.com. You can also use our advanced search operators in your query. If you are using operators, remember that there should not be a space between the operator and the URL. Note that we use brackets to indicate the words in the search box. The query itself should not include the brackets. Results from your site (site: operator) To find pages from your site, use the site: operator. For instance, [site:www.google.com]. You can also type the URL in the Domain field of the Advanced Search page. You can also use this feature to search through any site. Simply enter the search query followed by the site: operator and the site you want to search through. For instance, to search for admission information on the Stanford University web site, type [admission site:www.stanford.edu]. And, you can use this feature to search through sites from specific top-level domains. For instance, to search for information about Zürich on Swiss sites in the .ch domain, type [Zürich site:.ch]. Pages that link to your site (link: operator) To find pages that link to your site, use the link: operator. For instance, [link:www.google.com]. You can also type the URL in the Links field of the Advanced Search page. Pages that refer to your site’s URL (allinurl: operator) To find pages that include your site’s URL, use the allinurl: operator. For instance, [allinurl:www.google.com]. Information Google has about your site (info: operator) To see information that Google knows about your site, use the info: operator. For instance, [info:www.google.com]. That query results in the following: Some questions you may have about these results... I submitted my Sitemap but I don't see all the pages listed. When will they be indexed? We can't guarantee if or when we'll index pages we receive from Sitemaps submissions. 
We use Sitemaps as another view into your site to augment our regular crawling methods. Also, we don't want the Googlebot to overwhelm your bandwidth, so we may not crawl it all at one time.

Some of my results are labeled "Supplemental". What does that mean?
That means that those pages are part of our auxiliary index. You can read more about that in our webmaster guidelines.

Verifying your site
Verification
Once you verify your site, we show you additional statistics. We require verification to make sure that we only show these stats to site owners. When you verify, we ask you to upload a unique text file to a particular directory on your webserver. Periodically, we check to see if this file still exists. We do this to make sure you still own the site. If we can’t find this file when we recheck, we ask you to verify again.

Questions you might have about verification:

Once I’ve verified my site, can I delete the verification text file?
You can delete it once you’ve verified, but when we do our periodic check for it, we’ll ask you to upload it again.

Someone who used to have write-access to my site no longer does. How can I make sure this person can no longer see the stats for my site?
Simply delete the verification file. When we do our next periodic check, if that person tries to see stats for the site, we’ll ask for the file again. Since that person no longer has write access to the site, we won’t find the file and won’t show the additional statistics any longer.

But I still want to see stats for the site. How can I do that if my site is no longer verified?
We ask for a different verification file for each Google Account. You can log in with your Google Account and still see statistics.

The person who uploaded the Sitemaps for my site no longer has write access. I don’t have access to that person’s Google Account. How can I see information about my site?
Simply log in with your own Google Account and submit your Sitemaps using that account.
We’ll ask you to upload a verification file that is unique to your Google Account so that you can see additional stats for your site.

All new!
We’ve added some new features to Google Sitemaps.

Date stamps for statistics
For information we provide once you’ve verified your site, we now let you know when we tried to crawl the URLs we tell you about.

Enhanced support for special characters in URLs
Note that the Sitemap URL must be encoded for readability by the webserver on which it is located. In addition, it can contain only ASCII characters. It can't contain upper ASCII characters, certain control codes, or special characters such as * and {}. If your Sitemap URL contains out-of-range characters, escape them when you submit the URL. Otherwise, you'll receive an error when you try to submit it. You can find more information on escaping out-of-range characters by doing a Google search for [html escape codes]. All URLs must follow the RFC-3986 standard for URIs and the RFC-3987 standard for IRIs.

Documentation updates
We’ve updated the documentation for these new features, as well as added information about the latest version of the Sitemap Generator script and about OAI-PMH submissions (both of which we talked about in earlier blog posts). We’ve also provided some information about errors you might come across when you submit a Sitemap. We’ve made these updates in every language for which we provide documentation.

Resolved issues
And we’ve resolved two issues with this release that you brought to our attention in the Google Group.
If either happens to you, or if you experience any other trouble, please let us know by posting in the Google Group. Several of these features were a direct result of your feedback. Once again, we appreciate your input during our beta period.

We show you more
Just about a month ago, Google Sitemaps added new statistics about problems Google encountered crawling your pages. This stats page showed you up to three URLs we had trouble accessing for each type of error. You asked for more. So we’re giving you more. Now, once you verify your site, we’ll show you up to 10 URLs we’ve had trouble accessing for each type of error, for a maximum total of 60 URLs. Keep posting your suggestions to our Google Group and we'll keep listening. Thanks for your participation during our beta period.

How is a Google Sitemap different from an HTML sitemap?
A Google Sitemap is an XML file that uses the Sitemap protocol. This file lists URLs in your site, along with optional descriptive information about those URLs (such as when they were last updated and how often you modify them). You can create this XML file using our Sitemap Generator or a third-party tool. Google Sitemaps are intended for processing by the Google Sitemaps program.

An HTML sitemap is intended for users of your site. Generally, this type of sitemap provides links to the pages in your site, and may provide descriptions of those pages. We encourage the use of HTML sitemaps. They make it easier for users to navigate your site. Also, as we talk about in our webmaster guidelines, a clear hierarchy with text and links helps us index your site.

You can’t submit an HTML sitemap to the Google Sitemaps program. However, if you are unable to create or generate a Google Sitemap file in the Sitemap protocol format, you can submit a text file that lists URLs in your site.
Using OAI-PMH with Google Sitemaps
If your site uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) 2.0 protocol, an application-independent interoperability framework based on metadata harvesting, you can use your OAI repository as your Sitemap. Simply submit the baseURL of your OAI repository (for instance, http://www.example.com/oaiserver). When we query the baseURL, we automatically add query parameters (such as ?verb=Identify or ?verb=ListRecords), so you can simply submit the baseURL itself.

When we extract the URLs for your site, we expect the records in the repository to be formatted using Dublin Core, with the URLs embedded in <dc:identifier> tags. Below is a sample record that includes the <dc:identifier> tag in bold. The URL listed in that tag is what we extract.
<oai_dc:dc
As with other Sitemaps, the URLs must be within the same site and at the same directory location or lower than the baseURL. For instance, if you submit http://www.example.com/oaiserver as the baseURL, the following URLs would be valid:
http://www.example.com/
However, if you submit http://www.example.com/dataprovider/oaiserver, then none of those URLs would be valid.
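To make the two rules above concrete (URLs come from <dc:identifier> tags, and they must sit at or below the baseURL's directory), here is a small Python sketch. It is an illustration only, not how Google processes repositories. The sample record is a made-up stand-in, not the record from the original post, and the helper names `extract_identifiers` and `is_under_base` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Standard Dublin Core element namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical Dublin Core record, invented for this example.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Hypothetical record</dc:title>
  <dc:identifier>http://www.example.com/papers/1.html</dc:identifier>
</oai_dc:dc>"""


def extract_identifiers(record_xml):
    """Return the text of every <dc:identifier> element in the record."""
    root = ET.fromstring(record_xml)
    tag = "{%s}identifier" % DC_NS
    return [el.text.strip() for el in root.iter(tag) if el.text]


def is_under_base(url, base_url):
    """True if url is on the same site and at or below base_url's directory."""
    base, target = urlsplit(base_url), urlsplit(url)
    # Directory of the baseURL: everything up to and including the last slash.
    base_dir = base.path.rsplit("/", 1)[0] + "/"
    same_site = (target.scheme, target.netloc) == (base.scheme, base.netloc)
    return same_site and (target.path or "/").startswith(base_dir)


urls = extract_identifiers(SAMPLE_RECORD)
valid = [u for u in urls if is_under_base(u, "http://www.example.com/oaiserver")]
```

With http://www.example.com/oaiserver as the baseURL, the sample URL passes the check. With http://www.example.com/dataprovider/oaiserver it does not, because the URL is outside the /dataprovider/ directory.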
Combining Sitemaps into one larger Sitemap
Do you have several small Sitemaps that you would like to combine into one larger one? With version 1.3 of the Sitemap Generator, which we told you about yesterday, you can do just that. This version includes a new input method: To use this input method, locate the The <-- ** MODIFY or DELETE ** This section gives one example. You should replace this example and include an entry for each Sitemap you want to include. Ensure that the
<sitemap path="/var/www/docroot/subpath/sitemap*.xml">
The Sitemap Generator extracts all URLs and the optional data listed for each URL for every Sitemap you list and creates one Sitemap with this information. At this time, we can't guarantee that this method will work for Sitemaps created with tools other than the Sitemap Generator.

Announcing Sitemap Generator version 1.3: Improved encoding support
The Sitemap Generator version 1.3 is now available and provides improved encoding support. If your webserver uses an encoding other than UTF-8 or if your domain name or some of the URLs in your site use non-ASCII characters, and you plan to use the Sitemap Generator to create your Sitemap, you should download this latest version.

Generally, non-ASCII URLs should be encoded using UTF-8 before being percent-escaped. However, some webservers respond correctly only if URLs are encoded specifically for the webserver's configuration. All URLs within your Sitemap, as well as the URL of the Sitemap itself, must be encoded for readability by the web server on which they are located.

If you are using the Sitemap Generator, you can specify the encoding of the URLs contained in the Sitemap from within the config.xml file. Within the site definition section of that config file, use the optional default_encoding attribute to specify the encoding used by your webserver.
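To see why the encoding choice matters, here is a short sketch of percent-escaping the same path under two encodings. It uses the standard library of a current Python 3 rather than the Sitemap Generator itself (which targeted Python 2), and the helper name `escape_path` and the sample path are assumptions for illustration.

```python
from urllib.parse import quote


def escape_path(path, encoding="utf-8"):
    """Encode a URL path with the given encoding, then percent-escape it."""
    # Keep "/" unescaped so the path structure survives.
    return quote(path, safe="/", encoding=encoding)


# The same non-ASCII path escapes differently depending on the encoding
# the webserver expects.
utf8_form = escape_path("/zürich/index.html")
latin1_form = escape_path("/zürich/index.html", encoding="latin-1")
```

Here `utf8_form` is `/z%C3%BCrich/index.html` and `latin1_form` is `/z%FCrich/index.html`. A server configured for one form may not recognize the other, which is the situation the default_encoding attribute is meant to address.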
If you don't use this attribute and your webserver uses an encoding other than UTF-8, the Sitemap Generator can't know which encoding to use, although it does attempt to determine the correct encoding. If the generated Sitemap doesn't list the URLs correctly, you should explicitly indicate the encoding with the default_encoding attribute and run the Sitemap Generator again.

If your URLs contain non-ASCII characters, we recommend that you run the Sitemap Generator script using Python 2.3 or higher. This version of Python has increased non-ASCII support. If your domain name contains non-ASCII characters, you must use Python 2.3 or later, as Internationalizing Domain Names in Applications (IDNA) support wasn't added until this version. Without IDNA support, the Sitemap Generator can't correctly encode a non-ASCII domain name.

Google Sitemaps in your language
We’ve just made our Sitemaps user interface and documentation available in ten additional languages. We have also set up Google Groups for each one. The languages available are: Brazilian Portuguese, Dutch, French, German, Italian, Korean, Russian, Simplified Chinese, Spanish, Traditional Chinese, UK English, and US English. If you already use Google in one of these languages, you should see the change automatically. Otherwise, you can click the Preferences link from the Google home page and choose one of these languages from the interface list. As always, you can submit a Sitemap for sites with content in any language.

Verifying your site: trouble with 404 pages
You want to verify your site so you can view additional statistics. You click the verify link beside the site on the My Sitemaps page, create the file we ask for, upload it to your server, and click the Check Status button. And then you see this error message:

We've detected that your 404 (file not found) error page returns a status of 200 (OK) in the header.

What should you do?
This error means that we've detected that your server returns a status of OK when the requested file is not found. This is the same status that the server returns when the file exists. When we look for the verification file, we can't tell if your server is returning a status of OK because it finds the file, or because it can't find the file. This means we are unable to verify your site.

Modify your web server configuration to return a status of 404 (file not found) in the header of 404 pages. If your site is hosted, ask your hosting company to do this. Make sure that if your server returns a custom error page when a requested file is not found, that page returns a 404 status in the header. And make sure that the server doesn't redirect requests that return "file not found" to a valid page of your site, such as your home page. This configuration returns a redirect status code (such as 301 or 302) rather than the correct 404 status code. You can read more about HTTP status codes here.

If you don't have a mechanism for checking the headers that your server returns, you can do a search for terms such as [check server header tool] to find online tools that will check this for you. Once your web server is configured correctly, try to verify your site again and we'll check the configuration.

Submitting mobile Sitemaps
If you've created and submitted Sitemaps for your non-mobile pages, or just want to submit a mobile Sitemap for the first time, here are a few tips to help you get started:
Identify your mobile Sitemaps content
Mobile pages and new statistics
We just launched two new features.

Mobile Sitemaps
You can already use Google Mobile Web Search on your mobile phone to search through sites that have been specifically designed for mobile phones, PDAs, and other handheld devices. We add new sites to our mobile web index regularly. You can help users find your mobile webpages by letting us know about those pages. Google Mobile Sitemaps lets you submit Sitemaps for URLs that serve mobile content. You create and submit a mobile Sitemap much in the same way you do other Sitemaps: with the Sitemap Generator, the Sitemap protocol, or via a syndication feed or text file. The biggest difference with a mobile Sitemap is that you have to submit a separate one for each markup language. Right now we support:
New Site Statistics
Now you can see information for the URLs in your site that we've had trouble accessing – both for URLs from your Sitemap and those we've discovered during a regular crawl. We won't show you these additional statistics until you verify your site, which is a very simple process. Click the verify link next to the site on your My Sitemaps page, create an empty file using the name we specify, and upload it to the folder where your Sitemap is located. We'll check to see that the file is there, which tells us that you have permission to upload files to that site, and then we'll show you the information.

What URLs should a Sitemap include?
So, what URLs from your site should you put in your Sitemap? Put in everything! List the URLs that contain your content, images, media, and anything else in your site. If you want to include only a subset of items, you can, but we’d like as much information about your site as you can give us. Remember that we respect robots.txt, so if you include any URLs in your Sitemap that are restricted in robots.txt, we won’t crawl those.

What's in a name?
How should you name your Sitemap? What extension should you give it? The short answer is that you can name your Sitemap anything you want. You can use any extension. Just submit the URL to us, and we’ll go pick it up.

The better answer is a little longer. We recommend that you give your Sitemap an extension that identifies the file type. For instance, if you create a simple text file that lists URLs, we suggest giving that Sitemap a .txt extension. If you create an XML Sitemap that uses our Sitemap protocol, give it an .xml extension. If you compress that file using gzip, give it an .xml.gz extension.

If you use our Sitemap Generator to create a Sitemap, you specify the resulting Sitemap name in the config.xml file. The default name is sitemap.xml.gz. If you keep the .gz extension, the resulting Sitemap file will be compressed.
If you change this name to have an .xml extension, the resulting file will not be compressed. We suggest you compress the file so that your webserver will take less of a bandwidth hit when we download it.

You can submit the URL of a script that dynamically generates an XML Sitemap when we download it. That script might have an extension such as .asp or .php (depending on the script type). The extension of the file isn’t a problem, but if your script takes a long time to run, the delay will look like a server timeout and we’ll try again later. If you have trouble getting this type of Sitemap submitted, make sure your script responds quickly. Also ensure that your webserver doesn’t automatically add things (such as HTML headers and footers) to the generated files, since that would cause parsing errors in the resulting XML file.

One more thing about naming. You can name your Sitemap anything you want… almost. You can’t name it robots.txt. And if you use a robots.txt file for your site, make sure that it doesn’t restrict our access to your Sitemap file.

Using Sitemap Index Files

Several of you have asked us: should I submit Sitemaps or Sitemap indexes for my site?

If you have a small site, you probably don't need to use a Sitemap index file -- you can just list all of your URLs in one Sitemap. If you have a larger site, you may want or need to have multiple Sitemaps for your site. In that case, you can make submitting and tracking easier by listing the Sitemaps in a Sitemap index file. You must use multiple Sitemaps for your site when:
You can also have an index of Sitemap index files. A Sitemap index file can be a maximum of 10MB as well, so if you have a really large site, you may have to use this additional organization step to keep the file sizes to a manageable level. We have a size limitation for Sitemaps and Sitemap indexes so that when we download the files, we don't overwhelm your bandwidth.

Compressing your Sitemap index file

Speaking of being considerate of your bandwidth, if you can, you should compress your Sitemaps and your Sitemap indexes using gzip. If you're not familiar with gzip, keep watching this blog. We're putting together some helpful instructions. If you compress your Sitemap index file, you'll probably want to give it an .xml.gz extension. If you don't compress your Sitemap index file, you'll probably want to give it an .xml extension.

Submitting your Sitemap index file

So, you've got some individual Sitemaps that you've listed in a Sitemap index file. What now? Just sign in to Google Sitemaps and submit the Sitemap index file. You don't need to submit individual Sitemaps that are included in the index. Once we've processed your Sitemap index file, we'll let you know if we found errors in the Sitemap index itself, or in any of the individual Sitemaps.

If you make changes to a Sitemap included in a Sitemap index file you've submitted, just change the lastmod date for that Sitemap in your index. During this beta period, feel free to resubmit the Sitemap index file.

Just getting started...

We are stoked that so many of you are trying out Sitemaps while it's in beta. When we read on the Sitemaps Google Group that you'd like an easy way to find out about new features and where we're headed with this thing, we realized we could put our very own Blogger to good use. And since the Sitemaps team is far-flung in Mountain View, Kirkland, and Zurich, this will be a good way to keep everyone posted about new features and developments. Subscribe to our feed to keep up with the latest.
We'll also e-mail each blog post to the Sitemaps Google Group, so if you get your news from there, you won't miss out on a thing.

In addition to telling you about new features here, we'll also do our best to address some of the frequently asked questions from the Google Group. Lately, some of you have wondered if submitting a Sitemap could actually reduce the number of your site pages we have indexed. No, submitting a Sitemap will not reduce the number of indexed pages for your site.

As we note in our information for Webmasters, each time we update our database of webpages, our index shifts: we find new sites, we lose some sites, and sites' rankings change. If your site was dropped from Google and you haven't made major changes to it, we'll likely pick it up again soon. If your site's ranking changes, ensure you are following our guidelines.

When you submit your Sitemap, you help us learn more about the contents of your site. Participation in this program will not affect your pages' rankings or cause your pages to be removed from our index.

Copyright © 2005 Google Inc. All rights reserved.