Encoding parameters - fourth test

Introduction

For this fourth test I use a 4-minute sequence from a recent action movie that has everything needed for a challenging test: dark scenes, bright scenes, slow motion, smoke, rain, fire, water surfaces, splashes and some particle effects. The raw source is progressive and fairly free of noise.

Good news: I got hold of a dual Xeon PC. Bad news: it's running XP. The MEncoder binary for Windows is getting old and its performance (both speed and quality) is falling behind. However, I expect the test conditions to remain relevant for the base parameters. This is an opportunity to explore a rather large number of combinations in a reasonable time (2 weeks!!!).

From previous experience and other short tests I found that a 3-pass encode could significantly improve quality without taking much longer, provided "turbo" is used during the first pass. Surprisingly, using turbo for just the first pass actually improves quality. I have also noticed that B-frames can be rather destructive in scenes involving smoke or some kind of transparency.

As I wrote earlier, precmp should be tested afterwards. But in this test I will fix up front all the parameters I usually adjust afterwards; for example, precmp will be set to 3. The following parameters remain constant during the tests:

turbo is only active during the first pass. And since I had to run that test on XP, I also had to write a new test script (Warning: this links directly to a VBScript and the file has the "txt" extension instead of "vbs" to prevent mistakes of our Windows friends ;-).

Motion estimation, motion search range and B-frames

This time I will test the combinations of 5 parameters: max_bframes, cmp, subcmp, predia and dia. For cmp and subcmp, I will try the functions SAD, SSE, SATD and DCT, numbered 1, 2, 3 and 4 respectively as before. For the diamond sizes I will try -3, -2, -1, 1, 2, 3 and 4, numbered from 1 to 7. B-frames will be limited to either 0 or 1. That makes 2 x 4 x 4 x 7 x 7 = 1568 combinations to test.
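The size of the search space follows directly from the parameter ranges. The Python sketch below enumerates the combinations and builds the four-digit set labels that appear in the result tables. It assumes that the four comparison values are 1, 2, 3 and 6 (the values listed in the header of the frequency table further down) and that each digit of a label is the 1-based position of a value in its list. The names CMP_VALUES, DIA_VALUES, all_combinations, set_label and decode_label are invented for this illustration and are not taken from the test script.

```python
from itertools import product

# Assumed parameter ranges (see the paragraph above): four comparison
# functions, seven diamond sizes and two B-frame limits.
CMP_VALUES = [1, 2, 3, 6]
DIA_VALUES = [-3, -2, -1, 1, 2, 3, 4]
BFRAME_LIMITS = [0, 1]


def all_combinations():
    """Return every (max_bframes, cmp, subcmp, predia, dia) tuple."""
    return list(product(BFRAME_LIMITS, CMP_VALUES, CMP_VALUES,
                        DIA_VALUES, DIA_VALUES))


def set_label(cmp_, subcmp, predia, dia):
    """Build a label such as '2477' from 1-based positions in the lists."""
    pairs = ((CMP_VALUES, cmp_), (CMP_VALUES, subcmp),
             (DIA_VALUES, predia), (DIA_VALUES, dia))
    return "".join(str(values.index(value) + 1) for values, value in pairs)


def decode_label(label):
    """Turn a four-digit label back into option values."""
    c, s, p, d = (int(ch) - 1 for ch in label)
    return {"cmp": CMP_VALUES[c], "subcmp": CMP_VALUES[s],
            "predia": DIA_VALUES[p], "dia": DIA_VALUES[d]}


if __name__ == "__main__":
    print(len(all_combinations()))   # 2 * 4 * 4 * 7 * 7 combinations
    print(set_label(2, 6, 4, 4))     # label for cmp=2:subcmp=6:predia=4:dia=4
    print(decode_label("2377"))
```

Under these assumptions the label 2477 reads as cmp=2:subcmp=6:predia=4:dia=4, which matches the way the results are discussed in the text.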

The following table shows the best results sorted by average PSNR. There is no "user time" because I don't know how to do that on XP.
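Each row of the table condenses the per-frame PSNR values of one encode into four numbers. A minimal sketch of that reduction, assuming the per-frame values have already been extracted from the encoder log into a list (the extraction is not shown) and assuming that SD is the population standard deviation; summarize is a hypothetical helper and not part of the test script:

```python
from statistics import fmean, pstdev


def summarize(values):
    """Reduce per-frame values to the Min./Mean/Max./SD columns."""
    values = list(values)
    if not values:
        raise ValueError("at least one frame is required")
    return {
        "min": min(values),
        "mean": fmean(values),
        "max": max(values),
        "sd": pstdev(values),   # population SD: an assumption, see above
    }


if __name__ == "__main__":
    # Three made-up per-frame PSNR values, only to show the call.
    print(summarize([40.0, 44.0, 48.0]))
```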

Without B-frames, the combination cmp=2:subcmp=6 is a clear winner (it's the first time DCT is giving good results for me) and dia=4 is popular. The choice of predia is more difficult, but predia=-1 and predia=4 are well represented at the top.

With B-frames, higher average PSNR can be reached. But the standard deviation is larger and both min. and max. are lower. The combination cmp=2:subcmp=3 is popular as usual and dia=4:predia=4 is winning. As suspected after the third test, the quality increases with the diamond sizes. Surprisingly, cmp=1 is also well represented. 

Log PSNR

no B-frames                          |  max 1 B-frame
set   Min.   Mean    Max.   SD       |  set   Min.   Mean    Max.   SD
2477  38.76  44.966  59.5   2.041    |  2377  38.07  44.983  58.21  2.11
2437  38.76  44.966  59.55  2.04     |  2277  38.05  44.981  58.19  2.111
2417  38.74  44.966  59.73  2.047    |  3377  38.1   44.971  58.07  2.11
2467  38.75  44.965  59.65  2.048    |  3277  38.07  44.97   58.03  2.109
2475  38.75  44.964  59.57  2.047    |  1337  38.15  44.969  58.23  2.098
2473  38.76  44.964  59.47  2.044    |  3373  38.08  44.968  58.03  2.111
2463  38.75  44.964  59.6   2.049    |  1376  38.15  44.968  58.19  2.099
2436  38.75  44.964  59.53  2.047    |  1367  38.15  44.968  58.39  2.1
2413  38.74  44.964  59.69  2.048    |  1356  38.06  44.968  58.35  2.099
3477  38.76  44.963  59.54  2.045    |  1336  38.06  44.968  58.32  2.099
2476  38.74  44.963  59.62  2.047    |  1327  38.15  44.968  58.37  2.098
2457  38.75  44.963  59.54  2.054    |  1317  38.15  44.968  58.35  2.099
2433  38.74  44.963  59.65  2.046    |  1316  38.07  44.968  58.28  2.1
2416  38.75  44.963  59.55  2.052    |  2317  38.09  44.967  58.54  2.117
3475  38.75  44.962  59.5   2.047    |  1377  38.15  44.967  58.16  2.097
3467  38.75  44.962  59.57  2.052    |  1366  38.07  44.967  58.25  2.099
3417  38.74  44.962  59.61  2.046    |  1365  38.05  44.967  58.31  2.1
2426  38.75  44.962  59.7   2.057    |  1357  38.15  44.967  58.35  2.099
3476  38.75  44.961  59.51  2.047    |  1347  38.15  44.967  58.11  2.098
3473  38.74  44.961  59.55  2.048    |  1326  38.15  44.967  58.29  2.101
3437  38.76  44.961  59.58  2.046    |  1247  38.12  44.967  58.02  2.093
3427  38.75  44.961  59.63  2.048    |  1217  38.12  44.967  58.36  2.099
2453  38.75  44.961  59.54  2.052    |  2337  38.08  44.966  58.13  2.113
2435  38.74  44.961  59.61  2.049    |  1355  38.04  44.966  58.36  2.102
2415  38.74  44.961  59.68  2.053    |  1353  38.15  44.966  58.36  2.098
3457  38.76  44.96   59.59  2.05     |  1346  38.15  44.966  58.25  2.099
3447  38.75  44.96   59.51  2.052    |  1345  38.14  44.966  58.31  2.096
3413  38.75  44.96   59.61  2.047    |  1313  38.14  44.966  58.27  2.095
2474  38.74  44.96   59.49  2.047    |  1277  38.13  44.966  58.14  2.098
2434  38.76  44.96   59.56  2.049    |  1267  38.13  44.966  58.26  2.099
2423  38.75  44.96   59.56  2.045    |  1256  38.12  44.966  58.25  2.1
3463  38.74  44.959  59.46  2.052    |  1246  38.12  44.966  58.29  2.1
3433  38.76  44.959  59.59  2.046    |  1237  38.13  44.966  58.12  2.097
2431  38.74  44.959  59.63  2.052    |  1236  38.12  44.966  58.09  2.092
2427  38.75  44.959  59.67  2.045    |  1227  38.12  44.966  58.3   2.097
3474  38.75  44.958  59.77  2.049    |  1226  38.12  44.966  58.3   2.099
3446  38.75  44.958  59.64  2.055    |  3273  38.06  44.965  57.99  2.111
3436  38.75  44.958  59.43  2.047    |  2357  38.08  44.965  58.37  2.117
3466  38.75  44.957  59.63  2.049    |  2217  38.07  44.965  58.44  2.116
3465  38.74  44.957  59.67  2.05     |  1375  38.04  44.965  58.25  2.099
3456  38.75  44.957  59.62  2.054    |  1335  38.13  44.965  58.32  2.1
3453  38.75  44.957  59.56  2.048    |  1325  38.13  44.965  58.22  2.095
3443  38.74  44.957  59.76  2.054    |  1315  38.04  44.965  58.28  2.099
3435  38.76  44.957  59.49  2.046    |  1276  38.11  44.965  58.19  2.097
3424  38.75  44.957  59.57  2.06     |  1266  38.12  44.965  58.3   2.093
3423  38.75  44.957  59.64  2.053    |  1265  38.09  44.965  58.3   2.101
3416  38.75  44.957  59.6   2.048    |  1257  38.13  44.965  58.24  2.098
2471  38.74  44.957  59.65  2.051    |  2367  38.08  44.964  58.36  2.116
3471  38.76  44.956  59.71  2.051    |  1371  38.06  44.964  58.31  2.106
3451  38.75  44.956  59.79  2.058    |  1363  38.05  44.964  58.35  2.101
3426  38.74  44.956  59.63  2.054    |  1343  38.14  44.964  58.16  2.094
3425  38.75  44.956  59.65  2.054    |  1333  38.05  44.964  58.24  2.094
3415  38.74  44.956  59.69  2.051    |  1323  38.06  44.964  58.33  2.101
3455  38.75  44.955  59.64  2.056    |  1321  38.06  44.964  58.33  2.106
3432  38.76  44.955  59.7   2.053    |  1275  38.09  44.964  58.21  2.1
3414  38.74  44.955  59.57  2.057    |  1255  38.08  44.964  58.32  2.1
3411  38.75  44.955  59.69  2.06     |  1215  38.09  44.964  58.2   2.099
3472  38.76  44.954  59.61  2.052    |  2327  38.09  44.963  58.4   2.12
3464  38.75  44.954  59.72  2.055    |  1361  38.06  44.963  58.33  2.105
3462  38.76  44.954  59.66  2.054    |  1311  38.06  44.963  58.33  2.105
3461  38.75  44.954  59.53  2.054    |  1261  38.03  44.963  58.26  2.104
3434  38.75  44.954  59.68  2.053    |  1245  38.09  44.963  58.3   2.101
3431  38.76  44.953  59.69  2.055    |  1243  38.11  44.963  58.07  2.093
1467  38.83  44.951  59.52  2.04     |  1235  38.08  44.963  58.06  2.097
2367  38.76  44.95   59.48  2.049    |  1225  38.09  44.963  58.24  2.1

Statistics to the rescue

The frequency table makes it easier to obtain an overview of the test results:

average PSNR   | cmp              | subcmp           | predia                       | dia
partition      |   1   2   3   6  |   1   2   3   6  |  -3  -2  -1   1   2   3   4  |  -3  -2  -1   1   2   3   4

no B-frames
44.84 - 44.86  |   0   3   3  13  |  19   0   0   0  |   4   5   3   3   3   0   1  |   2  10   0   4   2   0   1
44.86 - 44.88  |   0  23   4  40  |  63   3   1   0  |   7   9   5  16  11  12   7  |  11  10   7  11  10  11   7
44.88 - 44.9   |  45  52  21 110  |  45  82  82  19  |  33  35  26  43  37  30  24  |  42  48  23  36  33  25  21
44.9 - 44.92   |   4  51  51  33  |  69   1   4  65  |  19  18  19  21  19  22  21  |  19  14  22  19  19  23  23
44.92 - 44.94  |  98   4  36   0  |   0  79  59   0  |  18  21  24  16  19  23  17  |  22  19  18  22  21  21  15
44.94 - 44.96  |  49  43  70   0  |   0  31  50  81  |  25  21  29  12  20  22  33  |  16  11  34  18  23  27  33
44.96 - 44.98  |   0  20  11   0  |   0   0   0  31  |   6   3   6   1   3   3   9  |   0   0   8   2   4   5  12

max 1 B-frame
44.82 - 44.83  |   0   0   0  11  |   6   0   0   5  |   2   2   1   2   2   0   2  |   0   0   0  11   0   0   0
44.83 - 44.84  |   0   0   0  14  |   7   0   0   7  |   3   3   1   2   3   2   0  |   0   8   0   3   3   0   0
44.84 - 44.85  |   0   0   0  28  |  14   0   0  14  |   3   3   6   4   3   4   5  |  11   6   0   0  11   0   0
44.85 - 44.86  |   0   0   0  32  |  15   2   2  13  |   4   4   4   6   4   5   5  |   3   0  11   4   0  14   0
44.86 - 44.87  |   0   0   0  25  |   7   5   4   9  |   4   6   2   2   4   3   4  |   0   3   3   6   0   0  13
44.87 - 44.88  |   0   0   0  19  |   0  10   8   1  |   2   2   4   3   3   4   1  |   0   8   0   4   6   0   1
44.88 - 44.89  |   0   1   0  22  |   1   9  13   0  |   4   2   4   5   3   2   3  |  10   3   0   1   8   1   0
44.89 - 44.9   |   1   4   7  27  |  12  14  13   0  |   7   6   5   5   7   4   5  |   4   3  12   9   0  11   0
44.9 - 44.91   |   0  16  23  18  |  27   9   9  12  |   8   9   7   9   7  10   7  |   8  15   2  13   2   3  14
44.91 - 44.92  |  19  27  26   0  |  45   1   1  25  |   7  12  10  10  10  11  12  |  19  17   3  12  15   6   0
44.92 - 44.93  |  49  32  49   0  |  53  13  11  53  |  21  15  21  19  19  19  16  |  12  13  22  27  19  22  15
44.93 - 44.94  |  29  44  34   0  |   7  26  24  50  |  16  19  14  14  16  15  13  |  20  21  16   8  10  13  19
44.94 - 44.95  |   1  29  29   0  |   1  30  23   5  |   8   8   9   9   8   7  10  |  11   2   6   0  21  14   5
44.95 - 44.96  |   6  27  24   0  |   1  27  28   1  |   8   7   8   8   9  10   7  |   0   5  17   1   3  14  17
44.96 - 44.97  |  91  14   3   0  |   0  49  58   1  |  15  14  16  14  14  16  19  |  14   8  20  13  14  14  25
44.97 - 44.98  |   0   0   1   0  |   0   0   1   0  |   0   0   0   0   0   0   1  |   0   0   0   0   0   0   1
44.98 - 44.99  |   0   2   0   0  |   0   1   1   0  |   0   0   0   0   0   0   2  |   0   0   0   0   0   0   2
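A frequency table of this kind can be produced by sorting each result into an average-PSNR partition and counting how often every parameter value occurs there. The sketch below is one possible way to do it, not the script actually used. It assumes the results are available as (mean PSNR, parameter dictionary) pairs, that the lower bound of a partition is inclusive, and it works in thousandths of a dB to avoid floating-point surprises at the partition edges. The function name and the data layout are invented for the example; the three sample rows use mean values copied from the sorted table (sets 2477, 2473 and 1467).

```python
from collections import Counter, defaultdict


def frequency_table(results, start=44.84, width=0.02):
    """Count parameter values per PSNR partition.

    Keys of the returned mapping are partition lower bounds expressed in
    thousandths of a dB (44960 stands for 44.96).
    """
    start_m = round(start * 1000)
    width_m = round(width * 1000)
    table = defaultdict(Counter)
    for mean_psnr, params in results:
        offset = round(mean_psnr * 1000) - start_m
        low_m = start_m + (offset // width_m) * width_m
        for name, value in params.items():
            table[low_m][(name, value)] += 1
    return table


if __name__ == "__main__":
    sample = [
        (44.966, {"cmp": 2, "dia": 4}),
        (44.964, {"cmp": 2, "dia": -1}),
        (44.951, {"cmp": 1, "dia": 4}),
    ]
    for low, counts in sorted(frequency_table(sample).items()):
        print(low / 1000, dict(counts))
```

A useful sanity check on such a table: within one partition, the counts of every parameter group must add up to the same number of encodes.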

More options

Now let's experiment with the narrowed combinations described above, using the current Debian version of MEncoder: dev-CVS--4.0.3

No B-frames

Without B-frames, I would like to check cmp = 2 or 3, subcmp=6 and [pre]dia = -1, 3 or 4. 18 combinations...

set    user time | Log PSNR                    | ImageMagick MAE               | Log quantizer
                 | Min.   Mean    Max.   SD    | Min.    Mean    Max.    SD    | Min.  Mean   Max.  SD
32436  17m45s    | 36.82  44.98   59.72  2.082 | 31.635  519.17  1114.9  110.1 | 2     3.439  7     0.758
32467  19m40s    | 36.83  44.988  59.6   2.095 | 27.382  519.22  1116.8  111.3 | 2     3.433  7     0.7664
32466  17m51s    | 36.83  44.982  59.71  2.09  | 23.9    519.25  1118.4  110.4 | 2     3.439  7     0.7617
32476  19m31s    | 36.82  44.983  59.55  2.088 | 30.91   519.25  1115.6  110.6 | 2     3.436  7     0.7605
32437  19m21s    | 36.84  44.986  59.74  2.093 | 30.046  519.27  1113.9  111   | 2     3.433  7     0.7641
32463  17m24s    | 36.83  44.983  59.7   2.091 | 30.717  519.31  1116.5  110.9 | 2     3.433  7     0.7637
32473  18m56s    | 36.83  44.985  59.7   2.094 | 27.844  519.36  1116.4  111.2 | 2     3.433  7     0.7635
33467  30m06s    | 36.68  44.981  59.73  2.092 | 30.193  519.41  1126.6  111.1 | 2     3.442  7     0.766
32433  17m05s    | 36.83  44.983  59.47  2.094 | 29.592  519.46  1114.8  111.2 | 2     3.433  7     0.7648
32477  21m02s    | 36.84  44.984  59.72  2.093 | 27.898  519.51  1121.1  111.3 | 2     3.434  7     0.7631
33466  26m17s    | 36.83  44.974  59.6   2.089 | 28.855  519.62  1115.6  110.9 | 2     3.447  7     0.7616
33436  25m04s    | 36.84  44.979  59.58  2.092 | 29.761  519.67  1114.4  111.1 | 2     3.443  7     0.7667
33476  29m27s    | 36.69  44.977  59.82  2.095 | 31.267  519.69  1126.3  111.4 | 2     3.443  7     0.7665
33463  24m45s    | 36.82  44.977  59.53  2.092 | 27.133  519.7   1116.6  110.9 | 2     3.445  7     0.7642
33433  23m24s    | 36.84  44.976  59.68  2.094 | 27.214  519.72  1114.8  111.2 | 2     3.444  7     0.7663
33437  28m35s    | 36.85  44.976  59.47  2.094 | 28.33   519.8   1115    111.3 | 2     3.444  7     0.7643
33473  27m59s    | 36.69  44.977  59.55  2.094 | 29.122  519.81  1126.3  111.3 | 2     3.444  7     0.765
33477  32m49s    | 36.69  44.977  59.54  2.094 | 28.787  519.87  1126.3  111.3 | 2     3.442  7     0.7669

The results are sorted according to MAE and the best PSNR are marked. I would recommend the following:

Max 1 B-frame

With up to 1 B-frame, I would like to check cmp = 1 or 2, subcmp= 2 or 3 and [pre]dia = -1, 3 or 4. 36 combinations...

set    user time | Log PSNR                    | ImageMagick MAE               | Log quantizer
                 | Min.   Mean    Max.   SD    | Min.    Mean    Max.    SD    | Min.  Mean   Max.  SD
32367  16m34s    | 36.49  45.016  58.41  2.119 | 23.091  508.47  1158.3  106.7 | 2     3.959  8     1.286
32337  16m26s    | 36.19  45.015  58     2.117 | 26.5    508.51  1200.3  106.9 | 2     3.96   8     1.286
32336  15m20s    | 36.46  45.01   57.94  2.117 | 25.826  508.58  1161.9  106.7 | 2     3.96   8     1.285
32363  14m51s    | 36.48  45.012  58.3   2.12  | 24.097  508.59  1160.3  106.8 | 2     3.959  8     1.287
32267  22m33s    | 36.49  45.014  58.39  2.122 | 23.206  508.64  1157.7  106.8 | 2     3.951  8     1.287
32377  16m50s    | 36.22  45.014  58.16  2.117 | 24.714  508.69  1197.1  106.7 | 2     3.961  8     1.286
32263  18m09s    | 36.48  45.012  58.35  2.121 | 23.734  508.69  1159.5  106.8 | 2     3.949  8     1.286
32237  22m19s    | 36.18  45.012  58     2.116 | 25.372  508.71  1202.9  106.8 | 2     3.952  8     1.286
32333  14m41s    | 36.48  45.012  57.95  2.117 | 26.838  508.72  1157.4  106.7 | 2     3.959  8     1.286
32236  19m11s    | 36.45  45.008  58.05  2.119 | 24.823  508.73  1162.4  106.8 | 2     3.952  8     1.285
32277  23m18s    | 36.46  45.011  58.01  2.117 | 25.198  508.76  1162.4  106.7 | 2     3.952  8     1.285
32233  17m52s    | 36.17  45.011  58.06  2.123 | 25.449  508.83  1199.7  107   | 2     3.95   8     1.288
32373  15m06s    | 36.47  45.009  57.97  2.118 | 25.757  508.85  1161.3  106.7 | 2     3.959  8     1.285
32273  18m55s    | 36.45  45.008  57.95  2.117 | 25.434  508.85  1164.4  106.7 | 2     3.951  8     1.285
32376  15m46s    | 36.62  44.997  58.04  2.111 | 24.962  509.1   1141.3  105.7 | 2     3.971  8     1.285
32276  20m14s    | 36.59  44.996  58.05  2.112 | 24.831  509.21  1147.2  105.9 | 2     3.96   8     1.284
31237  20m52s    | 36.06  44.997  58.02  2.111 | 24.98   509.29  1214.1  106.8 | 2     3.908  8     1.274
31367  15m27s    | 36.06  44.996  57.79  2.11  | 27.796  509.3   1213.2  106.8 | 2     3.915  8     1.271
32366  15m29s    | 37.2   44.991  58.45  2.126 | 22.862  509.32  1069.3  105.9 | 2     3.981  8     1.294
31267  21m03s    | 36.04  44.997  57.91  2.111 | 26.255  509.33  1216.2  106.8 | 2     3.907  8     1.274
31376  14m40s    | 36.05  44.995  58.02  2.113 | 25.141  509.34  1213.2  106.8 | 2     3.916  8     1.272
31336  14m16s    | 36.04  44.994  58.09  2.114 | 24.987  509.35  1215.4  106.9 | 2     3.917  8     1.273
31366  14m24s    | 36.06  44.995  57.94  2.112 | 25.241  509.35  1213.9  106.8 | 2     3.916  8     1.272
31377  15m43s    | 36.06  44.996  58.13  2.113 | 25.181  509.36  1212.6  106.9 | 2     3.916  8     1.273
32266  19m23s    | 37.2   44.989  58.48  2.125 | 22.875  509.38  1069.1  105.9 | 2     3.972  8     1.291
31236  17m50s    | 36.05  44.995  58.11  2.112 | 25.01   509.4   1214.2  106.9 | 2     3.907  8     1.273
31337  15m20s    | 36.04  44.995  58.02  2.112 | 25.176  509.41  1214.3  106.8 | 2     3.916  8     1.272
31276  18m46s    | 36.04  44.996  58.03  2.113 | 25.06   509.43  1215.2  106.8 | 2     3.908  8     1.274
31363  13m41s    | 36.12  44.993  57.87  2.114 | 25.504  509.44  1208.3  106.9 | 2     3.919  8     1.274
31266  17m56s    | 36.03  44.994  58.07  2.113 | 24.275  509.45  1216.5  106.9 | 2     3.909  8     1.274
31277  21m49s    | 36.05  44.997  57.87  2.111 | 26.126  509.45  1214.2  106.7 | 2     3.907  8     1.274
31333  13m35s    | 36.08  44.992  57.98  2.114 | 26.106  509.53  1212.1  106.8 | 2     3.921  8     1.273
31373  13m57s    | 36.11  44.991  57.96  2.113 | 25.393  509.53  1208.5  106.8 | 2     3.92   8     1.273
31233  16m26s    | 36.09  44.992  57.75  2.111 | 26.623  509.54  1213.9  106.8 | 2     3.911  8     1.272
31263  16m39s    | 36.09  44.992  58.05  2.112 | 25.104  509.59  1212.1  106.7 | 2     3.91   8     1.274
31273  17m25s    | 36.09  44.992  58.04  2.113 | 25.534  509.6   1213.9  106.7 | 2     3.911  8     1.274

The results are sorted according to MAE and the best PSNR are marked. I would recommend the following: cmp=2:subcmp=3:predia=3:dia=4. But predia=-1 would be good too.
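The ranking rule used for these tables (order by mean MAE, then flag the best mean PSNR) can be written down in a few lines. This is only a sketch: the dictionary layout and the function name are invented, and the three sample rows are copied from the table above (sets 32367, 32337 and 32366), deliberately given out of order.

```python
def rank_by_mae(rows):
    """Sort rows by mean MAE and flag the row(s) with the best mean PSNR."""
    best_psnr = max(row["psnr_mean"] for row in rows)
    ordered = sorted(rows, key=lambda row: row["mae_mean"])
    return [(row["set"], row["psnr_mean"] == best_psnr) for row in ordered]


SAMPLE = [
    {"set": "32366", "psnr_mean": 44.991, "mae_mean": 509.32},
    {"set": "32367", "psnr_mean": 45.016, "mae_mean": 508.47},
    {"set": "32337", "psnr_mean": 45.015, "mae_mean": 508.51},
]

if __name__ == "__main__":
    for label, is_best in rank_by_mae(SAMPLE):
        print(label, "best PSNR" if is_best else "")
```

Sorting on MAE and flagging PSNR separately keeps both metrics visible, which helps when they disagree about the best set.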

Fine tuning

Let's test some variations based on the previous recommendations, just to check that the initial assumptions were correct.

No B-frames

set         user time | Log PSNR                    | ImageMagick MAE               | Log quantizer
                      | Min.   Mean    Max.   SD    | Min.    Mean    Max.    SD    | Min.  Mean   Max.  SD
reference   19m34s    | 36.83  44.988  59.6   2.095 | 27.382  519.22  1116.8  111.3 | 2     3.433  7     0.7664
last_pred1  16m37s    | 36.91  44.979  59.64  2.081 | 29.283  519.01  1116.6  109.9 | 2     3.441  7     0.7557
last_pred2  17m43s    | 36.82  44.986  59.79  2.085 | 27.172  518.85  1123.4  110.2 | 2     3.437  7     0.7577
last_pred4  21m58s    | 36.62  44.983  59.64  2.094 | 32.796  519.67  1137.3  111.5 | 2     3.431  7     0.764
precmp1     16m57s    | 36.63  44.981  59.62  2.094 | 27.387  519.71  1140.9  111.7 | 2     3.433  7     0.7639
precmp2     18m09s    | 36.81  44.984  59.57  2.092 | 30.614  519.5   1113    111.2 | 2     3.433  7     0.7633
preme       14m35s    | 36.87  44.983  59.57  2.085 | 27.465  519     1119.8  110.3 | 2     3.438  7     0.757
turbo       27m12s    | 36.86  44.974  59.49  2.068 | 28.727  519.38  1120.8  110   | 2     3.446  7     0.7221
vqcomp      19m38s    | 35.55  45.016  59.79  2.197 | 26.634  519.41  1252.2  116.1 | 2     3.403  9     0.8938
vqdiff      19m34s    | 36.81  44.986  59.56  2.089 | 31.141  519.26  1122.4  111   | 2     3.431  7     0.7626
vqscale     19m45s    | 36.48  44.972  59.58  2.099 | 30.458  520.09  1149.3  111.6 | 2     3.444  7     0.7838

Using last_pred=3:precmp=3:preme=2:(turbo) was good. I'm still amazed that using the turbo in n-pass mode can improve quality while saving time. Not using vqdiff=2 or vqscale=2 was also good, but leaving vqcomp to its default would have been better. The new reference in this case is:
mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:cmp=2:subcmp=6:predia=3:dia=4:vpass=1/3/3:(turbo)
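The shorthand vpass=1/3/3:(turbo) can be read as three encoder runs that share the same options, with vpass=1 on the first run, vpass=3 on the next two, and turbo added to the first run only. A minimal sketch of that expansion follows; pass_options is a hypothetical helper, and the base string is the reference above without its vpass part (a bitrate and the other fixed options would still have to be added for a real encode).

```python
BASE = ("mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:"
        "cmp=2:subcmp=6:predia=3:dia=4")


def pass_options(base, passes=(1, 3, 3), turbo_first=True):
    """Return one -lavcopts string per encoder run."""
    options = []
    for index, vpass in enumerate(passes):
        opts = f"{base}:vpass={vpass}"
        if turbo_first and index == 0:
            opts += ":turbo"      # turbo only on the first run
        options.append(opts)
    return options


if __name__ == "__main__":
    for opts in pass_options(BASE):
        print(opts)
```

Calling pass_options(BASE, passes=(1, 2), turbo_first=False) gives the plain 2-pass variant written as vpass=1/2 in the text.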

There is one new thing I would like to test, which only works without B-frames: using chroma on *cmp functions. This means using precmp=259:cmp=258:subcmp=262. The new vqcomp=0.5 selection is in use and I also check a 2-pass encode without turbo:

set                    user time | Log PSNR                    | Log quantizer
                                 | Min.   Mean    Max.   SD    | Min.  Mean   Max.  SD
xcmp                   26m42s    | 35.6   45.045  60     2.214 | 2     3.397  9     0.8974
xcmp 2-pass, no turbo  25m40s    | 36.1   45.043  59.98  2.235 | 2     3.386  8     0.9764

Including chroma was a great idea: the PSNR reaches even higher than with B-frames. The 3 passes didn't make much difference, but they were not very expensive either; let's stick to 2-pass to keep it simple. The winner is:
mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=259:cmp=258:subcmp=262:predia=3:dia=4:vpass=1/2
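The chroma variants used here are simply the plain comparison values with 256 added: precmp=3 becomes 259, cmp=2 becomes 258 and subcmp=6 becomes 262. The sketch below shows that arithmetic as a bit flag. The helper names are invented, and treating 256 as a flag bit is an interpretation that matches the values quoted in the text.

```python
CHROMA_FLAG = 256


def with_chroma(value):
    """Return the comparison value with the chroma flag set."""
    return value | CHROMA_FLAG


def split_cmp(value):
    """Split a comparison value into (base function, uses chroma)."""
    return value & ~CHROMA_FLAG, bool(value & CHROMA_FLAG)


if __name__ == "__main__":
    plain = {"precmp": 3, "cmp": 2, "subcmp": 6}
    chroma = {name: with_chroma(v) for name, v in plain.items()}
    print(":".join(f"{name}={v}" for name, v in chroma.items()))
    print(split_cmp(262))
```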

Max 1 B-frame

set         user time | Log PSNR                    | ImageMagick MAE               | Log quantizer
                      | Min.   Mean    Max.   SD    | Min.    Mean    Max.    SD    | Min.  Mean   Max.  SD
reference   22m31s    | 36.49  45.016  58.41  2.119 | 23.091  508.47  1158.3  106.7 | 2     3.959  8     1.286
last_pred1  19m59s    | 37.23  44.994  58.5   2.121 | 22.879  509.1   1066.3  105.7 | 2     3.984  8     1.293
last_pred2  20m55s    | 37.24  44.997  58.39  2.119 | 22.925  509.01  1066.6  105.7 | 2     3.985  8     1.293
last_pred4  24m37s    | 36.18  45.015  58.47  2.124 | 23.25   508.66  1201.3  107   | 2     3.961  8     1.287
precmp1     21m17s    | 36.2   45.012  58.17  2.121 | 24.636  508.72  1199.1  107.2 | 2     3.959  8     1.29
precmp2     21m52s    | 36.21  45.013  58.39  2.117 | 24.03   508.64  1196.9  106.8 | 2     3.96   8     1.287
preme       19m57s    | 36.51  45.018  58.53  2.121 | 23.16   508.31  1158.3  106.7 | 2     3.965  8     1.288
turbo       30m52s    | 36.64  45.003  58.14  2.101 | 24.422  508.85  1140.1  106   | 2     3.995  8     1.28
vqcomp      22m31s    | 36.66  45.028  58.42  2.17  | 24.049  508.63  1142.6  108.4 | 2     3.958  8     1.321
vqdiff      22m31s    | 36.12  45.015  58.41  2.118 | 23.091  508.48  1209.7  106.7 | 2     3.96   8     1.283
vqscale     22m49s    | 36.27  45.012  58.54  2.135 | 22.805  508.77  1192.2  106.6 | 2     3.939  8     1.288

Using last_pred=3:precmp=3:(turbo) was good. Not using vqdiff=2 or vqscale=2 was also good, but leaving vqcomp to its default would have been better. It looks like preme=1 would have been better too, but preme=2 is required to take advantage of predia. The new reference I would use is:
mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:cmp=2:subcmp=3:predia=3:dia=4:vmax_b_frames=1:vpass=1/3/3:(turbo)

There are a few more tests to make: using vb_strategy=1 and higher values of bidir_refine, which are only relevant for B-frames. On top of that, vb_strategy gives more weight to the first pass, so it's worth trying to remove the turbo and test with only 2 passes...

set                                                                  user time | Log PSNR                    | Log quantizer
                                                                               | Min.   Mean    Max.   SD    | Min.  Mean   Max.  SD
bidir_refine=1                                                       23m45s    | 35.43  45.033  58.15  2.18  | 2     3.952  9     1.323
bidir_refine=2                                                       26m10s    | 36.11  45.036  58.52  2.183 | 2     3.951  8     1.326
bidir_refine=3                                                       29m30s    | 35.44  45.036  58.42  2.182 | 2     3.948  9     1.322
bidir_refine=4                                                       31m9s     | 36.1   45.034  58.52  2.182 | 2     3.949  8     1.321
vb_strategy=1                                                        20m21s    | 36.23  45.059  59.7   2.178 | 2     3.701  9     1.039
vb_strategy=1, no turbo                                              28m7s     | 36.35  45.054  59.57  2.152 | 2     3.712  9     1.018
vb_strategy=1, no turbo, 2-pass                                      18m51s    | 36.37  45.06   59.72  2.189 | 2     3.674  8     1.037
vb_strategy=2                                                        25m54s    | 36.24  45      59.66  2.199 | 2     3.414  8     0.8989
vb_strategy=1, max 2 B-frames                                        20m50s    | 36.44  45.044  59.67  2.168 | 2     3.791  8     1.058
bidir_refine=2:vb_strategy=1                                         21m30s    | 36.24  45.06   59.7   2.178 | 2     3.698  9     1.043
bidir_refine=2:vb_strategy=1, no turbo, 2-pass                       19m21s    | 36.3   45.069  59.66  2.196 | 2     3.668  8     1.04
vb_strategy=1, no turbo, 2-pass and chroma *cmp                      25m45s    | 36.35  45.081  59.96  2.198 | 2     3.673  8     1.036
bidir_refine=2:vb_strategy=1, no turbo, 2-pass and chroma *cmp       26m18s    | 36.39  45.088  59.99  2.204 | 2     3.667  8     1.039

bidir_refine and vb_strategy sure did improve the PSNR. Removing the turbo was necessary to make full use of vb_strategy, so I ended up with only 2 passes instead of 3. You can also notice an attempt to use chroma on motion estimation even though the documentation mentions that it doesn't work correctly with B-frames: it worked and it paid off (but it's expensive!). The winner is:
mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:cmp=2:subcmp=3:predia=3:dia=4:bidir_refine=2:vmax_b_frames=1:vb_strategy=1:vpass=1/2
And if you dare, you can try with precmp=259:cmp=258:subcmp=259.

Comparisons

Now I would like to show some curiosity for the expensive qns and qprd as usual, but also try things like dia=-10 and even XviD or x264. The parameters used to test XviD or x264 are taken from the best recommended options of the MPlayer manual.

set       user time | Log PSNR                     | ImageMagick MAE               | Log quantizer
                    | Min.   Mean    Max.   SD     | Min.    Mean    Max.    SD    | Min.  Mean   Max.   SD
ref0      25m17s    | 36.1   45.043  59.98  2.235  | 22.803  517.45  1210.4  114.3 | 2     3.386  8      0.9764
ref0dia   25m56s    | 35.36  45.043  59.98  2.254  | 22.384  517.71  1299.1  116.6 | 2     3.388  9      0.9912
ref0qns1  44m27s    | 35.5   45.053  59.91  2.25   | 22.663  515.24  1248    115.3 | 2     3.266  8      0.9491
ref0qns2  47m16s    | 35.42  45.048  60.02  2.264  | 22.671  515.63  1255.6  115.8 | 2     3.309  8      0.966
ref0qns3  70m27s    | 35.4   45.039  60.01  2.26   | 22.534  515.87  1256.1  115   | 2     3.331  8      0.9652
ref0qprd  27m13s    | 35.92  45.038  59.8   0.2782 | 24.315  517.27  1214.1  113.4 | 2     3.589  21.15  0.07696
ref1      19m31s    | 36.3   45.069  59.66  2.196  | 28.407  513.2   1176.1  113.7 | 2     3.668  8      1.04
ref1dia   19m58s    | 36.12  45.061  59.62  2.207  | 30.582  513.86  1201.6  115.3 | 2     3.677  8      1.051
ref1qns1  37m44s    | 35.77  45.054  59.51  2.191  | 29.826  511.52  1216.3  113.3 | 2     3.562  8      1.019
ref1qns2  39m50s    | 36     45.055  59.62  2.208  | 32.545  511.77  1186.2  114   | 2     3.604  8      1.03
ref1qns3  59m56s    | 36.16  45.053  59.66  2.21   | 29.708  511.91  1168    113.6 | 2     3.62   8      1.027
ref1qprd  21m31s    | 36.77  45.054  59.46  0.4619 | 28.693  513.06  1127.1  111   | 2     3.867  17.26  0.2635
x264      30m38s    | -      45.641  -      -      | 44.214  484.03  1062.2  105.2 | -     -      -      -
xvid      7m58s     | 37.61  44.502  55.69  2.099  | 38.414  521.77  1086.2  109.8 | 2     4.656  8      1.399

XviD was amazingly fast. It has the lowest quality of the test but it might be possible to push it up, given enough testing. x264 is excellent (and slow), but I had problems playing it with many media players and right now it doesn't look like a suitable alternative for archiving. dia=-10 was not actually interesting.

To evaluate the effect of qns and qprd, I can't simply rely on PSNR. I use (at last) my old script to build a mosaic of the frames and ease visual comparison. My conclusion is that qns=1 might have helped when not using B-frames, but in general it was best without qprd or qns for this type of movie. Another remark is that using B-frames gave better results than not using any and was faster, but during the visual comparison I noticed some color shifts, like a green or red stain around detailed areas. My guess is that it would have helped to use the chroma *cmp even with B-frames. Another thing I would like to check: is it OK to allow up to 2 B-frames when vb_strategy is set to 1 and limits them?...

set                                       user time | Log PSNR                    | Log quantizer
                                                    | Min.   Mean    Max.   SD    | Min.  Mean   Max.  SD
ref1                                      19m31s    | 36.3   45.069  59.66  2.196 | 2     3.668  8     1.04
ref1 with chroma *cmp                     26m18s    | 36.39  45.088  59.99  2.204 | 2     3.667  8     1.039
ref1 with max 2 B-frames and chroma *cmp  25m40s    | 36.79  45.07   59.96  2.191 | 2     3.763  7     1.058

It seems to be best with at most 1 consecutive B-frame. And using the chroma *cmp didn't help remove the colored stains.

Conclusion

The winner is:
mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:cmp=2:subcmp=3:predia=3:dia=4:bidir_refine=2:vmax_b_frames=1:vb_strategy=1:vpass=1/2.

When rendering the full movie at 900 kbit/s, the short test sequence receives a higher average bitrate and the end result is near perfect on a CRT TV. Looking at the quantizer histogram of the full movie, there is no problem setting vqmin=1 and vqmax=6. It is in those conditions that qprd and qns are able to make a positive difference. To obtain an encode that looks very good scaled to 200% on an LCD screen, I had to use up to 1100 kbit/s with qprd:qns=1:vqmin=1:vqmax=5. Also note that, if you have enough patience and enough bits, qns=3 will make most ringing disappear.

The next section will be about filtering. Encoding the original artifacts is probably not what we want and, in that respect, home DV movies are very different from modern DVD productions. Testing on DVD sources will therefore end here; they have been useful as a high-quality source that allowed me to defer filtering until now. But if you have already been burning your videos to DVD and didn't keep the raw DV sources, the next section will not help you get rid of MPEG-2 artifacts. My suggestion is to at least filter out ringing and blocking before encoding, using a postprocessing filter: pp=hb/vb/dr. Or simply use pp=ac. Insert it after cropping; don't use it after the denoise filter.

In case you wondered, the complete command to encode the first pass is: mencoder stream.dump -oac copy -ovc lavc -lavcopts vbitrate=1100:mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:cmp=2:subcmp=3:predia=3:dia=4:bidir_refine=2:vmax_b_frames=1:vb_strategy=1:qns=1:qprd:vqmin=1:vqmax=5:vpass=1 -vf crop=704:416:8:80,scale=-1:288,pp=hb/vb/dr,hqdn3d=0:0:4 -o /dev/null

And for the second pass: mencoder stream.dump -oac copy -ovc lavc -lavcopts vbitrate=1100:mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:precmp=3:cmp=2:subcmp=3:predia=3:dia=4:bidir_refine=2:vmax_b_frames=1:qns=1:qprd:vqmin=1:vqmax=5:vpass=2:psnr:autoaspect -vf crop=704:416:8:80,scale=-1:288,pp=hb/vb/dr,hqdn3d=0:0:4 -o test.avi
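Since the two commands differ only in a few options, they can be generated from shared pieces, which makes it harder for the passes to drift apart. The sketch below rebuilds both command lines exactly as quoted above; the constant and function names are invented, and the script only prints the commands (running them, for instance with subprocess.run, is left out). The filter list also keeps the order recommended earlier: pp after the crop and before the hqdn3d denoiser.

```python
SHARED = ("vbitrate=1100:mbd=2:v4mv:trell:cbp:mv0:last_pred=3:preme=2:"
          "precmp=3:cmp=2:subcmp=3:predia=3:dia=4:bidir_refine=2:"
          "vmax_b_frames=1")
QUANT = "qns=1:qprd:vqmin=1:vqmax=5"
FILTERS = ["crop=704:416:8:80", "scale=-1:288", "pp=hb/vb/dr", "hqdn3d=0:0:4"]


def mencoder_command(source, lavcopts, output):
    """Return the argument list for one mencoder run."""
    return ["mencoder", source, "-oac", "copy", "-ovc", "lavc",
            "-lavcopts", lavcopts, "-vf", ",".join(FILTERS), "-o", output]


# First pass: vb_strategy=1 is set here only, as in the quoted command.
FIRST_PASS = mencoder_command(
    "stream.dump", f"{SHARED}:vb_strategy=1:{QUANT}:vpass=1", "/dev/null")

# Second pass: adds the PSNR log and automatic aspect ratio.
SECOND_PASS = mencoder_command(
    "stream.dump", f"{SHARED}:{QUANT}:vpass=2:psnr:autoaspect", "test.avi")

if __name__ == "__main__":
    print(" ".join(FIRST_PASS))
    print(" ".join(SECOND_PASS))
```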