Jim Posted December 25, 2023
Google Search Console lists some URLs as not indexed, with the reason "Soft 404". The URLs are of the form https://www.tubedomainname.com/embed/83622. Why does KVS create these URLs, and is that correct? They point to a page on the other tube site. What would be the best solution for these pages?
Tech Support Posted December 25, 2023
Do you mean this link contains a domain name other than yours? Then you are likely using 3rd-party embed codes for your videos, and these URLs are embed codes. For example, this URL is an embed code: https://www.kvs-demo.com/embed/123
By default, robots.txt prevents indexing of these URLs, as they are normally duplicates of your video pages and refer to your original video pages via canonical: <link href="https://www.kvs-demo.com/videos/123/tdwp-danger-wildman/" rel="canonical"/>
Jim Posted December 25, 2023
To clarify: Google Search Console lists some crawled URLs as not indexed, with the reason "Soft 404". The URLs are of the format https://www.mykvstubedomainname.com/embed/83622. This "embed" URL plays, full screen, the video that is shown on the page https://www.mykvstubedomainname.com/videos/83622/title-of-video-etc-etc
Google has only started discovering these "embed" pages in the last few weeks, and there are now 45. Is it the case, then, that the robots.txt file is not effective in preventing indexing of these "embed" URLs, and if so, how can it be corrected?
Tech Support Posted December 26, 2023
1 hour ago, Jim said: Is it the case then that the robots.txt file is not effective in preventing indexing of these "embed" URLs and if so how can it be corrected?
I don't think the robots.txt file is ineffective here. Google should not index these pages by default. You can check your embed URL in URL inspection; for our demo site it shows that indexing is blocked.
Jim Posted December 26, 2023
The embed page is not indexed. However, the real issue is that crawling of the page is allowed according to the URL inspection below, whereas for the demo URL above crawling is blocked by robots.txt. It appears the robots.txt file is not effective in preventing crawling of these "embed" URLs. How can this be corrected?
Crawl
Last crawl: Dec 25, 2023, 7:12:15 PM
Crawled as: Googlebot smartphone
Crawl allowed? Yes
Page fetch: Successful
Indexing allowed? Yes
Tech Support Posted December 26, 2023
What are the contents of your robots.txt?
Jim Posted December 26, 2023
Would adding this previously suggested Googlebot allow rule have the effect of overriding all the general disallows, including ../embed/.., which would explain why robots.txt was not blocking crawling? Otherwise the robots.txt file is the same as the KVS default, so I have removed it.
Tech Support Posted December 27, 2023
If robots.txt is the same as on kvs-demo.com, then it should block indexing of embed URLs. If you have removed it, then indexing is allowed for all URLs.
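The question of whether a Googlebot-specific allow rule can override the general disallows can be checked offline. The sketch below uses Python's standard `urllib.robotparser`, which is not Googlebot's own parser but selects rule groups in a comparable way: a crawler that finds a group naming it uses that group and ignores the `User-agent: *` group. Both rule sets are assumptions for illustration, since the thread shows neither the actual KVS default file nor the Googlebot rule that was added.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical stand-in for the default rules; the real KVS file is not shown in the thread.
DEFAULT_RULES = """\
User-agent: *
Disallow: /embed/
"""

# Hypothetical Googlebot-specific group of the kind asked about above.
GOOGLEBOT_GROUP = """\
User-agent: Googlebot
Allow: /
"""

EMBED_URL = "https://www.mykvstubedomainname.com/embed/83622"


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Only the general group: Googlebot falls back to "User-agent: *".
default_only = is_crawl_allowed(DEFAULT_RULES, "Googlebot", EMBED_URL)

# With a dedicated Googlebot group: the general group no longer applies to Googlebot.
combined = GOOGLEBOT_GROUP + "\n" + DEFAULT_RULES
with_group = is_crawl_allowed(combined, "Googlebot", EMBED_URL)

print("default rules only, crawl allowed:", default_only)
print("with Googlebot group, crawl allowed:", with_group)
```

Under these assumptions the embed URL is blocked with the general rules alone and becomes crawlable once the Googlebot group is present, which would match the "Crawl allowed? Yes" result in URL inspection. If that is the cause, the embed disallow would also have to be listed inside the Googlebot group, or the Googlebot group removed.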