Gary Illyes from Google posted on Reddit that he is looking for examples of where Googlebot might not be behaving. Specifically, he wants examples of “cases where refresh crawls are going nuts.” I assume this means cases where Googlebot is recrawling pages very aggressively.
There is a complaint on Reddit about Googlebot hitting an old article URL over and over again. It is very likely someone pretending to be Googlebot, but there is a chance it is an issue on Google's side. The complaint was:
I’m looking into the logs events from a news site, and i got an spike in one specific day in an article published on 2020.
Anyone have an idea on why would Googlebot access so many time in one particular article?
Gary Illyes from Google replied: “Wanna dm me the URL if you could confirm it’s indeed Googlebot? I’m looking for cases where refresh crawls are going nuts and would appreciate this example.”
He also shared how to verify Googlebot to ensure it is really Google accessing those pages.
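The thread itself is not reproduced here, so this is only a rough sketch of the check I assume he means: the reverse-then-forward DNS lookup that Google's documentation describes. You do a reverse lookup on the IP from your logs, check that the hostname ends in googlebot.com or google.com, then do a forward lookup on that hostname and confirm it maps back to the same IP. The function name and the injectable lookup parameters are my own, added for illustration and so the logic can be tried without network access.

```python
import socket

# Hostname suffixes assumed here for Google's crawlers; check Google's
# current documentation before relying on this list.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse(ip):
    # Reverse DNS: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    # Forward DNS: hostname -> set of IP address strings.
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip passes the reverse-then-forward DNS check.

    reverse_lookup and forward_lookup are hypothetical hooks for this
    sketch; by default they use the system resolver.
    """
    reverse_lookup = reverse_lookup or _reverse
    forward_lookup = forward_lookup or _forward

    try:
        host = reverse_lookup(ip)
    except OSError:  # no PTR record or resolver failure
        return False

    # The leading dot in each suffix rejects names like "fakegooglebot.com".
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        addresses = forward_lookup(host)
    except OSError:
        return False

    # The hostname must resolve back to the IP seen in the logs.
    return ip in addresses
```

You would call `is_verified_googlebot(ip)` for each IP in your logs that claims a Googlebot user agent. Only hits that pass are worth sending along as examples.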
So I guess if you have examples, share them with Gary on Reddit.
Forum discussion at Reddit.
Update: Later today, Google posted this on Twitter:
Before reporting an Indexing issue, always consult our community forums and check our support documentation, which highlights some additional helpful tools: https://t.co/zAnLObe41y https://t.co/IVZUc0hxPB
— Google Search Central (@googlesearchc) August 16, 2021