I'm not qualified to discuss this from the scientific side.
But from what I read, the science is sound (no pun intended), and I do know what BS science is...
My understanding:
Jitter means clock issues. The clock data is NOT transferred.
(There was an article referenced some years ago on AVS about how Meridian tackles this...)
The signal (zip) arrives bit perfect and is unzipped bit perfect (no decryption issues).
DA conversion with the player's clock introduces jitter.
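To make that concrete, here's a toy sketch (the sample rate, tone, and jitter figure are made up for illustration, not taken from any particular player) showing how samples whose values are bit perfect still come out wrong if the DAC's clock fires at slightly wrong instants:

```python
# Toy illustration: the sample values are bit perfect, but the DAC clock
# fires at slightly wrong instants, so the analog output differs from ideal.
import math
import random

fs = 48_000.0        # nominal sample rate (Hz)
f = 10_000.0         # test tone frequency (Hz)
jitter_rms = 2e-9    # assumed RMS clock jitter, 2 ns (purely illustrative)
n = 1_000

sq_err = 0.0
for k in range(n):
    t_ideal = k / fs
    t_actual = t_ideal + random.gauss(0.0, jitter_rms)    # jittered clock edge
    value_in_bits = math.sin(2 * math.pi * f * t_ideal)   # what the data says
    ideal_at_edge = math.sin(2 * math.pi * f * t_actual)  # what should be output at that instant
    sq_err += (value_in_bits - ideal_at_edge) ** 2

print("RMS output error from clock jitter alone: %.2e" % math.sqrt(sq_err / n))
```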
As a side note: the zip analogy isn't quite right here.
Audio/video playback is a time-sensitive stream, while zip/unzip applies to a file outside the time domain.
Diogen.
Sigh. There may not be much point unless we can discuss this on a technical level, especially if you say 'don't get technical' and then come back with technical issues.
I don't agree that the clock is local. Actually, the clock is distributed, but not coupled. There is a data clock involved in sending the bitstream out, and one at the receiving end (receiver or television for audio/video) to decode the signal. The jitter you speak of is the change in skew between the two. Either side can introduce it: on the source side, a bit can be sent out early or late; on the receiving side, the signal can be sampled at the wrong time. There are synchronizing sequences built in that are designed to periodically get the clocks back in sync.
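To put a rough picture on the two uncoupled clocks, here's a toy model (the 100 ppm offset and the 64-bit resync interval are invented numbers, not from any spec):

```python
# Toy model of two uncoupled clocks: the source puts bits on the wire at its
# own rate, the receiver samples at its own (slightly different) rate, and a
# periodic sync pattern pulls the skew back to zero.
bit_period = 1.0              # source bit period, arbitrary units
rx_period = 1.0 + 100e-6      # receiver clock runs 100 ppm slow (assumed)
resync_every = 64             # bits between synchronizing sequences (assumed)

skew = 0.0                    # drift of the sample point from the bit centre
worst = 0.0
for bit in range(1, 10_000):
    skew += rx_period - bit_period      # drift accumulates every bit
    if bit % resync_every == 0:
        skew = 0.0                      # sync sequence re-centres the sampling
    worst = max(worst, abs(skew))

print("worst drift between resyncs: %.4f bit periods" % worst)
# Far below the +/- 0.5 bit-period point where a sample would land in the
# wrong bit cell.
```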
Jitter can introduce a bit error if it shifts the sampling point to before or after a transition. However, the system was designed with good margins and oscillators are pretty good these days. It would take a very poor design to exceed the sampling window. It just shouldn't happen with a clean signal.
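To put rough numbers on "good margins", take S/PDIF at 48 kHz as an example (my choice of interface and jitter figure, not anything from this thread):

```python
# Back-of-the-envelope sampling margin for S/PDIF at 48 kHz, biphase-mark coded.
# The interface choice and the 5 ns jitter figure are my assumptions.
frames_per_sec = 48_000
bits_per_frame = 64        # 2 subframes x 32 bits
cells_per_bit = 2          # biphase-mark uses two cells per data bit

cell_rate = frames_per_sec * bits_per_frame * cells_per_bit   # 6.144 M cells/s
cell_period_ns = 1e9 / cell_rate                               # ~163 ns
margin_ns = cell_period_ns / 2                                 # ~81 ns to the nearest edge
assumed_jitter_ns = 5.0

print("cell period %.0f ns, sampling margin %.0f ns" % (cell_period_ns, margin_ns))
print("5 ns of jitter uses only %.0f%% of that margin"
      % (100 * assumed_jitter_ns / margin_ns))
```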
However, little of this relates to your diagram, except on one point. If the cable has a high capacitance, it will have longer rise and fall times, and the receiving end can sample before (positive transition) or after (negative transition) the signal has crossed the threshold. This would introduce a bit error, but it would have nothing to do with jitter. That would tend to show too many samples at one or the other polarity, thus introducing macroblocking in video or pops in audio, not frequency loss.
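A crude way to see the capacitance effect is a lumped RC model (real cables behave as transmission lines, and the impedance and capacitance values here are invented, so treat this as a sketch only):

```python
import math

# Crude lumped-RC picture of what extra cable capacitance does to an edge.
v_swing = 1.0        # normalised logic swing
v_threshold = 0.5    # receiver decision threshold, assumed at mid-swing
r_source = 75.0      # ohms, assumed driver/termination impedance

for c_pf in (50, 300, 1000):                 # modest, high, absurd capacitance
    tau = r_source * c_pf * 1e-12            # RC time constant in seconds
    # exponential edge: v(t) = v_swing * (1 - exp(-t / tau))
    t_cross_ns = -tau * math.log(1.0 - v_threshold / v_swing) * 1e9
    print("C = %4d pF -> threshold crossed %5.1f ns after the edge starts"
          % (c_pf, t_cross_ns))
```

Against the roughly 81 ns margin in the S/PDIF example above, the worst of those crossings eats most of the window, which is exactly the "very poor design" case.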
I guess we could introduce jitter into the equation if we figured that early or late sampling only happened 'sometimes'. In that case, the bit error rate would be less predictable. If the early/late jitter was predictable, you might see a high-frequency loss, but it would be difficult to predict it in normal consumer operation. You certainly could create it under lab conditions, but that isn't real life.
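If the early/late sampling really were random, a quick Monte Carlo (again using the invented S/PDIF-style margin from above, with made-up jitter values) shows why you'd need wildly bad jitter before errors appear at all, and why they'd then look like random errors rather than a tidy frequency loss:

```python
import random

# Model the sampling instant as Gaussian jitter and count how often it lands
# outside the valid half-cell.
margin_ns = 81.0          # assumed margin from the S/PDIF example above
trials = 200_000

for jitter_rms_ns in (5.0, 30.0, 60.0):      # made-up jitter values
    errors = sum(1 for _ in range(trials)
                 if abs(random.gauss(0.0, jitter_rms_ns)) > margin_ns)
    print("jitter %4.0f ns RMS -> %.4f%% of samples land in the wrong cell"
          % (jitter_rms_ns, 100.0 * errors / trials))
```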
Again, I need to read second-level material. The link you provided is two experts arguing, each (politely) calling the other a damned fool. It is very difficult to take sides with all the jargon and marginally valid experiments taking place.
I think I will drop back out of this now. I think this is religious, and I really don't want to get into a theology discussion.