The error has two parts--one is from sampling, and the other is from everything else--for example, supporters of the different candidates might differ in their willingness to respond, or the formulas for calculating turnout might be wrong in ways that favor one party over the other. The size of the sampling error can be calculated from statistical theory--the other part can't, and has to be estimated from experience.
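To give a sense of the part that can be calculated, here is a minimal sketch of the standard error of a lead under simple random sampling. The shares (48% and 45%) and the sample size of 1,000 are made-up illustrative numbers, not figures from any actual poll, and the function name is my own. Real polls use weighting, which generally makes the sampling error somewhat larger than this formula suggests.

```python
import math

def lead_standard_error(p1: float, p2: float, n: int) -> float:
    """Standard error of the lead (p1 - p2), as a proportion.

    Assumes a simple random sample of size n in which p1 and p2 are
    the shares for the two candidates (they need not sum to 1, since
    some respondents may be undecided or choose someone else).
    """
    lead = p1 - p2
    variance = (p1 + p2 - lead ** 2) / n
    return math.sqrt(variance)

# Hypothetical poll: 48% vs. 45% with 1,000 respondents.
se = lead_standard_error(0.48, 0.45, n=1000)
print(f"standard error of the lead: {100 * se:.2f} points")
```

Because the variance shrinks in proportion to 1/n, combining many polls into one large sample drives this part of the error down, while doing nothing about the non-sampling part.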
The combined sample for the RCP average is so large that sampling error is trivial, on the order of 0.1%. The samples for the individual Gallup polls are smaller, so sampling error is a factor. However, it's not nearly big enough to account for the discrepancies. A rough calculation is that the non-sampling error has a standard deviation of about 5. Until 1948, the Gallup poll used quota samples. That period included two big errors--the notorious 1948 election, and the 1936 election, when the poll showed a Democratic lead of +12 (56%-44%), but the actual lead was +25 (62.5%-37.5%). If you confine it to the 1952-2012 period, the non-sampling error has a standard deviation of about 3--that is, a candidate leading by three in the polls would lose about one time in six if the errors are normally distributed. Most of the prediction sites seemed to give Clinton about an 85% chance of winning, which was just about right according to this analysis. Someone just going by the polls would have been a little surprised by the outcome, but not shocked.
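The "one time in six" figure can be checked with a few lines of code. This is only a sketch under the assumptions stated above: the error in the lead is normally distributed with mean zero and a standard deviation of 3, and the leader loses when the error is bigger than the lead. The function names are mine.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_leader_loses(lead: float, error_sd: float) -> float:
    """Chance that the polling leader actually loses.

    Assumes the error in the lead is normal with mean zero and
    standard deviation error_sd, both in percentage points.
    """
    return normal_cdf(-lead / error_sd)

p_lose = prob_leader_loses(lead=3.0, error_sd=3.0)
print(round(p_lose, 3))      # 0.159, about one in six
print(round(1 - p_lose, 3))  # 0.841, the leader's chance of winning
```

A lead equal to one standard deviation gives the leader roughly an 84% chance, which is where the comparison with the 85% from the prediction sites comes from.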
Ironically, the problem may have been that journalists (and other observers, including me) paid too much attention to other "information" beyond the polls. Similarly, in the Brexit referendum, the perception of the result as a shock wasn't based on the polls, which showed a close contest, but on a general reading of the public mood.