It's not that cut and dried -- it's highly dependent on the content type, the intermediate processing, and the person watching. While you're absolutely correct that humans don't directly see interlacing, it's not 100% seamless: show fast-motion content like panning aerial shots from a nature show, or a fast sport with left-right panning like basketball, soccer, or hockey, and most people will describe the interlaced feed as having a "soft" picture, because interlacing artifacts come across as lost edge detail. Fortunately, we humans were endowed with a brain that blends the interlaced fields together, so we can't perceive the interlacing itself without the aid of a still frame to show us what was happening.
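If you want to see where that softness comes from, here's a minimal toy sketch (my own example, not anything from a broadcast pipeline): two fields captured a sixtieth of a second apart get woven into one frame, and because the subject moved between captures, the alternating rows no longer line up -- the "combing" a deinterlacer has to smear away.

```python
# Toy sketch (assumed example): weave two interlaced fields of a moving
# object into one frame. The shift between fields produces "combing" on
# the object's edges, which deinterlacing smooths into a softer picture.
import numpy as np

H, W = 8, 16

def frame_with_block(x_offset):
    """Progressive frame with a bright 4x4 block at a given x position."""
    f = np.zeros((H, W), dtype=np.uint8)
    f[2:6, x_offset:x_offset + 4] = 255
    return f

# Two fields captured 1/60 s apart while the block moves 3 pixels right.
field_even_src = frame_with_block(x_offset=4)   # earlier moment
field_odd_src  = frame_with_block(x_offset=7)   # later moment

# Weave deinterlace: even rows from the first field, odd rows from the second.
woven = np.empty((H, W), dtype=np.uint8)
woven[0::2] = field_even_src[0::2]
woven[1::2] = field_odd_src[1::2]

# Alternating rows no longer line up on the block's edges ("combing").
print(woven[2:6])
```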
Since CBS at 1080i is typically rated as better quality than ABC and Fox at 720p, the progressive-scan argument is pretty much lost.
For most of primetime TV, however, where shows are set against relatively static backdrops, yeah -- the extra resolution is going to produce a better-looking picture.
Then you still have the wildcard of TVs with advanced motion processing (i.e., 120Hz / 240Hz sets) that calculate intermediate frames to smooth out screen updates further. As detection algorithms have gotten better over the years, newer TVs can detect things like panning shots and neutralize interlacing artifacts almost completely, for everything but the edge of the screen where new content is appearing during the pan.
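For a rough idea of what "calculating intermediate frames" means, here's a minimal sketch that just linearly blends two frames -- a stand-in for the motion-compensated interpolation real 120Hz/240Hz sets actually do, and nothing like their actual algorithms:

```python
# Minimal sketch, assuming simple linear blending as a stand-in for the
# motion-compensated interpolation real 120Hz/240Hz TVs perform. The point
# is only the shape of the idea: synthesize frames between two broadcast
# frames so the panel has something to show at the higher refresh rate.
import numpy as np

def interpolate_frames(frame_a, frame_b, factor=2):
    """Return frame_a plus (factor - 1) blended in-between frames."""
    frames = [frame_a]
    for i in range(1, factor):
        t = i / factor
        blended = (1 - t) * frame_a.astype(np.float32) + t * frame_b.astype(np.float32)
        frames.append(blended.astype(frame_a.dtype))
    return frames

# 60 Hz source doubled to 120 Hz: one synthetic frame between each real pair.
a = np.zeros((4, 4), dtype=np.uint8)
b = np.full((4, 4), 200, dtype=np.uint8)
for f in interpolate_frames(a, b, factor=2):
    print(f[0])
```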
All that said, in general, progressive full-frame video at 1280x720 tends to look better than 1920x1080 interlaced video on content with lots of on-screen pixel changes, for most people on most devices. As you go down the scale from 100% of on-screen pixels updating per second toward 0%, there's a crossover point below which 1080i runs away with being the better choice. Every network tries to figure out which side of that crossover point most of its content falls on, and chooses a broadcast resolution accordingly.
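Boiled down to pseudocode, the decision looks something like the sketch below -- with the caveat that the 0.4 crossover value and the function name are purely illustrative, not any real broadcast standard:

```python
# Minimal sketch of the crossover idea with a made-up threshold: estimate
# what fraction of the screen changes per second and pick 720p for
# high-motion content, 1080i for mostly static content.
def pick_broadcast_format(pixel_change_fraction, crossover=0.4):
    """pixel_change_fraction: 0.0 (static) .. 1.0 (every pixel changes)."""
    return "720p" if pixel_change_fraction >= crossover else "1080i"

print(pick_broadcast_format(0.9))   # sports / panning aerials -> 720p
print(pick_broadcast_format(0.1))   # talking-heads primetime  -> 1080i
```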