How to read Spark SQL's toDebugString output?
I can't figure out how to read Spark SQL's toDebugString output.
I don't understand:
what the #number suffixes next to column names are
what true or false means next to Aggregate or Sort (e.g. Sort [l_returnflag#404 ASC,l_linestatus#405 ASC], true)
what BuildLeft or BuildRight means on ShuffledHashJoin
why there are !OutputFaker lines when querying Parquet databases, and what they mean
Here are links to toDebugString outputs for the same query on two different database formats.
Avro http://pastebin.com/BPwwfdzz
Parquet http://pastebin.com/pZNfCHPc