What exactly is duplicate content?

There is no mystery to what duplicate content is. It is just what the words imply: the same information repeated in multiple places. Anything that you think may be duplicate content probably is duplicate content. That includes syndicated articles, blog posts, duplicated directory listings, and so on. You name it: if it is found in more than one place, it is duplicate content.

And yes, in the eyes of Google and virtually everyone else, it IS duplicate content. The good news is that Google does not penalize your website for duplicate content. They may decide that your entire webpage is duplicate content, in which case they may not index it, or may place it in the supplemental index, and you may not receive any ranking benefit from it. However, this is not the same as a penalty. For the most part, Google will filter duplicate content from SERPs, and that’s about it.

There are other types of duplicate content that many webmasters seem to be unaware of, primarily canonicalization issues. A canonicalization issue arises when the exact same webpage can be reached through multiple URLs.

For example, the following URLs may all resolve to the same document:
http://www.domain.com/
http://domain.com/
http://www.domain.com/index.html
http://domain.com/index.html

In the eyes of search engines, these are separate documents, and thus duplicate content of one another.
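
To make the idea concrete, here is a minimal Python sketch of how the four URLs above could be collapsed into one canonical form. The function name `canonicalize`, the list `DEFAULT_DOCUMENTS`, and the choice of the non-www form with a trailing slash are all assumptions made for illustration. The www form would work just as well, as long as one form is used consistently.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of default-document names; adjust it to whatever
# the web server actually serves for a bare directory URL.
DEFAULT_DOCUMENTS = ("index.html", "index.htm")


def canonicalize(url: str) -> str:
    """Reduce equivalent URL variants to a single canonical form.

    Assumed policy (illustrative only): lowercase scheme and host,
    drop a leading "www.", and strip a trailing default document.
    """
    parts = urlsplit(url)

    # Host names are case-insensitive, so compare them in lowercase.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]

    # An empty path and "/" refer to the same root document.
    path = parts.path or "/"
    for name in DEFAULT_DOCUMENTS:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break

    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


urls = [
    "http://www.domain.com/",
    "http://domain.com/",
    "http://www.domain.com/index.html",
    "http://domain.com/index.html",
]

# All four variants collapse to a single canonical URL.
canonical_forms = {canonicalize(u) for u in urls}
print(canonical_forms)
```

Running this over a list of a site's own URLs is one way to spot which addresses are really the same document under different names.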

