Introduction to our VDOM.pm & vdom-webkit cluster
----
Introduction to our {{#x|VDOM.pm}} & {{#x|vdom-webkit}} cluster
☺{{#author|agentzh@yahoo.cn}}☺
{{#author|章亦春 (agentzh)}}
{{#date|2009.4}}
----
{{#v|VDOM}}
➥ {{#x|Visual}} DOM
➥ DOMs with {{#ci|visual information}}
----
{{#kw|window}} {{#kw|location}}={{#x|"http://foo.bar.com/index.html"}} {{#kw|innerHeight}}=802 {{#kw|innerWidth}}=929 {{#kw|outerHeight}}=943 {{#kw|outerWidth}}=1272 {
    {{#kw|document}} {{#kw|width}}=914 {{#kw|height}}=5119 {
        {{#v|...}}
    }
}
----
{{#kw|BODY}} {{#kw|offsetX}}=0 {{#kw|offsetY}}=0 {{#kw|offsetWidth}}=914 {{#kw|offsetHeight}}=5119 {{#kw|fontFamily}}=\"Helvetica,Arial,sans-serif\" {{#kw|fontSize}}=\"12px\" {{#kw|fontStyle}}=\"normal\" {{#kw|fontWeight}}=\"400\" {{#kw|color}}=\"rgb(0, 0, 0)\" {{#kw|backgroundColor}}=\"rgb(255, 255, 255)\" {
    {{#x|"\\n "}}
    {{#kw|DIV}} {{#kw|id}}=\"append_parent\" {{#kw|offsetX}}=0 {{#kw|offsetY}}=0 {{#kw|offsetHeight}}=0 {{#kw|backgroundColor}}=\"transparent\" {
        {{#x|"首页\\n\\n"}}
        {{#v|...}}
    }
    {{#x|"\\n "}}
}
----
{{#kw|FONT}} {{#kw|color}}=\"rgb(255, 0, 0)\" {
    {{#kw|B}} {{#kw|fontWeight}}=\"401\" {
        {{#x|"购物"}}
    }
}
----
\"Why {{#ci|another}} language?\"
\"Why {{#x|not}} just borrow HTML or XML's syntax?\"
----
{{#cm|✓}} We want to keep the VDOM dump size {{#ci|small}}.
{{#cm|✓}} We want to keep VDOM dumps {{#ci|unambiguous}}.
{{#cm|✓}} We want to make VDOM more {{#x|human-readable}} and more {{#x|human-writable}}. (Yeah, XML/HTML's syntax is very {{#i|cumbersome}}.)
{{#cm|✓}} We want to make VDOM {{#i|parsers}} & {{#i|dumpers}} {{#ci|trivial}} to implement and verify.
(tens of lines of Perl, for example ;))
----
{{#x|☺}} We've already made both Mozilla {{#x|Gecko}} and Apple {{#x|WebKit}} {{#i|emit}} VDOMs
----
{{img src="#" width="0" height="0"}}
{{img src="images/gen-vdom.png" width="536" height="547"}}
----
{{#cm|# Generate VDOM from the command line:}}
{{#v|$}} {{#ci|vdomwebkit}} --proxy proxy.cn:1080 -o sina.vdom \\
    http://www.sina.com.cn
{{#cm|# Or access our vdom-webkit FastCGI server directly over HTTP:}}
{{#v|$}} curl 'http://vdom.cn.yahoo.com/=/vdom?{{#x|url=www.sina.com.cn}}' \\
    > sina.vdom
----
{{#cm|# The VDOM dump is about 30% smaller than the original HTML:}}
{{#v|$}} ls -lh {{#ci|sina.vdom}}
-rw------- 1 agentz agentz {{#x|278K}} 2009-04-10 10:34 sina.vdom
{{#v|$}} ls -lh {{#ci|sina.html}}
-rw-r--r-- 1 agentz agentz {{#x|400K}} 2009-04-10 10:34 sina.html
----
{{#cm|✓}} Now {{#ci|Perl}} enjoys {{#x|very powerful DOMs}}, as good as those in JavaScript.
----
{{#kw|use}} VDOM;
{{#kw|open}} {{#kw|my}} {{#v|$in}}, {{#x|"<"}}, {{#x|"sina.vdom"}} {{#kw|or}} {{#kw|die}} {{#v|$!}};
{{#kw|my}} {{#v|$win}} = VDOM::Window->new->parse_file({{#v|$in}});
{{#kw|my}} {{#v|$body}} = {{#v|$win}}->document->body;
{{#kw|for}} {{#kw|my}} {{#v|$child}} ({{#v|$body}}->childNodes) {
    print {{#v|$child}}->tagName;
    print {{#v|$child}}->offsetX;
    print {{#v|$child}}->offsetHeight;
    print {{#v|$child}}->color;
    print {{#v|$child}}->fontFamily;
    {{#v|...}}
}
----
print {{#v|$child}}->nextSibling;
{{#v|$win}}->document->getElementById({{#x|"foo"}});
{{#cm|# These are Firefox 3.1 DOM additions; we have them too ;)}}
print {{#v|$child}}->previousElementSibling;
print {{#v|$child}}->firstElementChild;
print {{#v|$child}}->parentNode;
print {{#kw|join}} {{#x|' '}}, {{#kw|map}} { {{#v|$_}}->href . {{#x|': '}} .
    {{#v|$_}}->textContent } {{#v|$child}}->getElementsByTagName({{#x|"A"}});
----
{{img src="#" width="0" height="0"}}
{{img src="images/vdom-pm.png" width="449" height="1176"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/vdom-pm2.png" width="373" height="285"}}
----
{{#cm|☺}} {{#i|Debug}} our Perl code from within {{#ci|Firefox}} via our {{#x|Visual DOM}} extension
----
{{img src="#" width="0" height="0"}}
{{img src="images/visualdom-ch.png" width="863" height="636"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/visualdom-ch-cfg.png" width="1016" height="762"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/between-ff-perl.png" width="723" height="629"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/visualdom-lh.png" width="1016" height="762"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/visualdom-lh-cfg.png" width="863" height="636"}}
----
{{#cm|☺}} The {{#ci|qt-webkit port}} of our {{#x|Visual DOM}} extension is now under {{#i|active}} development ;)
----
{{#cm|☺}} Put everything into a {{#ci|cluster}}.
----
{{img src="#" width="0" height="0"}}
{{img src="images/cluster-arch.png" width="735" height="530"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/vdomwebkit-farm.png" width="291" height="453"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/proxy-guts2.png" width="384" height="339"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/prefetcher-guts.png" width="805" height="573"}}
----
{{img src="#" width="0" height="0"}}
{{img src="images/resty-guts.png" width="1072" height="440"}}
----
{{#x|Acknowledgements}}
{{#x|☺}} haibo++ convinced me that the {{#ci|separation}} of browser rendering engines from our hunter extractors via VDOM dumping could bring {{#ci|lots}} of benefits.
{{#x|☺}} jianingy++ effectively {{#ci|sparked}} the great WebKit craze in our team.
{{#x|☺}} xunxin++ {{#ci|ported}} the Visual DOM extension's JavaScript VDOM dumper to qt-webkit C++ and did most of the hard work in {{#ci|vdom-webkit}}.
{{#x|☺}} mingyou++ shared a great deal of his {{#ci|knowledge}} of the WebKit internals with us and also gave very good suggestions for the slides you're browsing.
----
☺ {{#ci|Any questions}}? ☺
----