They will estimate how much copyright infringement a site is responsible for from the number of DMCA takedown notices that Google receives from content owners. Here’s a quotation from a Search Engine Land article explaining the process:
But as it turns out, there is a way that Google can guestimate if there’s copyright infringement happening, by making use of Digital Millennium Copyright Act “takedown” requests.
These requests are one of the ways to get content removed from Google. Anyone can file a request. It’s not proof of copyright infringement. It’s merely an allegation, and one that can be challenged. But Google evaluates each request, and if deemed valid, content is removed.
The requests are a pain to file, and they only remove an individual web page. If you’re a big entertainment company, it’s like playing Whac-A-Mole. But now, Google’s shift will change the game from a page-by-page basis to a site-by-site one. Beginning next week, a site with a lot of requests against individual pages will find all of its pages ranking lower in Google.
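To make the page-by-page versus site-by-site distinction concrete, here is a rough Python sketch of how page-level notices could be rolled up into a per-site count. It is only an illustration: the function names, the example URLs, and the threshold are all hypothetical, and Google has not published how it actually counts or weighs these requests.

```python
from collections import Counter
from urllib.parse import urlparse


def count_notices_per_site(page_urls):
    """Group page-level takedown notices by the host name they target."""
    return Counter(urlparse(url).netloc.lower() for url in page_urls)


def flagged_sites(page_urls, threshold):
    """Return the sites whose notice count reaches the threshold.

    The threshold is a made-up illustrative value, not a figure
    from Google.
    """
    counts = count_notices_per_site(page_urls)
    return {site for site, n in counts.items() if n >= threshold}


# Hypothetical notices, each one aimed at a single page.
notices = [
    "http://example-a.test/page1",
    "http://example-a.test/page2",
    "http://example-a.test/page3",
    "http://example-b.test/only-page",
]

print(count_notices_per_site(notices))
print(flagged_sites(notices, threshold=3))
```

Each notice still names a single page, but once the notices are grouped by host, a site that collects many of them stands out as a whole, which is the shift the quotation describes.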