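DataFusion has a logical plan, LogicalPlan, and a physical plan, ExecutionPlan. Both support protobuf encoding and decoding, so the proto definitions are a good starting point for understanding them. Open datafusion.proto and look for the corresponding messages, LogicalPlanNode and PhysicalPlanNode.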

The proto file makes it especially clear that both are recursively nested structures:

// logical plan  
// LogicalPlan is a nested type  
message LogicalPlanNode {  
  oneof LogicalPlanType {  
    ListingTableScanNode listing_scan = 1;  
    ProjectionNode projection = 3;  
    SelectionNode selection = 4;  
    LimitNode limit = 5;  
    AggregateNode aggregate = 6;  
    JoinNode join = 7;  
    SortNode sort = 8;  
    RepartitionNode repartition = 9;  
    EmptyRelationNode empty_relation = 10;  
    CreateExternalTableNode create_external_table = 11;  
    ExplainNode explain = 12;  
    WindowNode window = 13;  
    AnalyzeNode analyze = 14;  
    CrossJoinNode cross_join = 15;  
    ValuesNode values = 16;  
    LogicalExtensionNode extension = 17;  
    CreateCatalogSchemaNode create_catalog_schema = 18;  
    UnionNode union = 19;  
    CreateCatalogNode create_catalog = 20;  
    SubqueryAliasNode subquery_alias = 21;  
    CreateViewNode create_view = 22;  
    DistinctNode distinct = 23;  
    ViewTableScanNode view_scan = 24;  
    CustomTableScanNode custom_scan = 25;  
    PrepareNode prepare = 26;  
    DropViewNode drop_view = 27;  
    DistinctOnNode distinct_on = 28;  
    CopyToNode copy_to = 29;  
    UnnestNode unnest = 30;  
    RecursiveQueryNode recursive_query = 31;  
    CteWorkTableScanNode cte_work_table_scan = 32;  
    DmlNode dml = 33;  
  }  
}

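LogicalPlanNode enumerates every kind of logical plan node, and each concrete node type in turn nests LogicalPlanNode (for example, as its input plan). LogicalExprNode is the corresponding recursively nested structure for expressions: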

message LogicalExprNode {  
  oneof ExprType {  
    // column references  
    datafusion_common.Column column = 1;  
  
    // alias  
    AliasNode alias = 2;  
  
    datafusion_common.ScalarValue literal = 3;  
  
    // binary expressions  
    BinaryExprNode binary_expr = 4;  
  
  
    // null checks  
    IsNull is_null_expr = 6;  
    IsNotNull is_not_null_expr = 7;  
    Not not_expr = 8;  
  
    BetweenNode between = 9;  
    CaseNode case_ = 10;  
    CastNode cast = 11;  
    NegativeNode negative = 13;  
    InListNode in_list = 14;  
    Wildcard wildcard = 15;  
    // was  ScalarFunctionNode scalar_function = 16;  
    TryCastNode try_cast = 17;  
  
    // window expressions  
    WindowExprNode window_expr = 18;  
  
    // AggregateUDF expressions  
    AggregateUDFExprNode aggregate_udf_expr = 19;  
  
    // Scalar UDF expressions  
    ScalarUDFExprNode scalar_udf_expr = 20;  
  
    // GetIndexedField get_indexed_field = 21;  
  
    GroupingSetNode grouping_set = 22;  
  
    CubeNode cube = 23;  
  
    RollupNode rollup = 24;  
  
    IsTrue is_true = 25;  
    IsFalse is_false = 26;  
    IsUnknown is_unknown = 27;  
    IsNotTrue is_not_true = 28;  
    IsNotFalse is_not_false = 29;  
    IsNotUnknown is_not_unknown = 30;  
    LikeNode like = 31;  
    ILikeNode ilike = 32;  
    SimilarToNode similar_to = 33;  
  
    PlaceholderNode placeholder = 34;  
  
    Unnest unnest = 35;  
  
  }  
}
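To make the recursive nesting concrete, here is a minimal Rust sketch. It does not use the real generated types: `Plan`, `Expr`, their variants and fields, and the helpers `plan_depth`, `expr_columns`, and `sample_plan` are all hypothetical, simplified stand-ins chosen for illustration. The only point is the shape: a `oneof` maps naturally to an enum, and a node that contains another node of its own type needs indirection (`Box`) on the Rust side.

```rust
// Hypothetical, simplified stand-ins for the protobuf messages.
// Names and fields are illustrative and do NOT match the generated code.

/// Same shape as `LogicalExprNode`: an expression whose variants
/// may themselves contain expressions.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    BinaryExpr { left: Box<Expr>, op: String, right: Box<Expr> },
    Not(Box<Expr>),
}

/// Same shape as `LogicalPlanNode`: every non-leaf plan node holds
/// its input as another `Plan`, plus the expressions it evaluates.
#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    Scan { table: String },
    Selection { input: Box<Plan>, predicate: Expr },
    Projection { input: Box<Plan>, exprs: Vec<Expr> },
    Limit { input: Box<Plan>, fetch: usize },
}

/// Walks the plan recursively and returns the number of nested levels.
pub fn plan_depth(plan: &Plan) -> usize {
    match plan {
        Plan::Scan { .. } => 1,
        Plan::Selection { input, .. }
        | Plan::Projection { input, .. }
        | Plan::Limit { input, .. } => 1 + plan_depth(input),
    }
}

/// Walks an expression recursively and collects referenced column names
/// in left-to-right order.
pub fn expr_columns(expr: &Expr, out: &mut Vec<String>) {
    match expr {
        Expr::Column(name) => out.push(name.clone()),
        Expr::Literal(_) => {}
        Expr::BinaryExpr { left, right, .. } => {
            expr_columns(left, out);
            expr_columns(right, out);
        }
        Expr::Not(inner) => expr_columns(inner, out),
    }
}

/// Builds a plan roughly equivalent to
/// `SELECT a FROM t WHERE a > 1 LIMIT 10`.
pub fn sample_plan() -> Plan {
    let predicate = Expr::BinaryExpr {
        left: Box::new(Expr::Column("a".to_string())),
        op: ">".to_string(),
        right: Box::new(Expr::Literal(1)),
    };
    Plan::Limit {
        input: Box::new(Plan::Projection {
            input: Box::new(Plan::Selection {
                input: Box::new(Plan::Scan { table: "t".to_string() }),
                predicate,
            }),
            exprs: vec![Expr::Column("a".to_string())],
        }),
        fetch: 10,
    }
}
```

Calling `plan_depth(&sample_plan())` walks Limit, Projection, Selection, and Scan in turn. Any traversal over the real plan or expression tree follows the same pattern: match on the variant, then recurse into the nested node.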

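PhysicalPlanNode and PhysicalExprNode are the counterparts for physical plan nodes and expressions. They follow the same recursively nested structure as the logical plan, so they are not covered further here.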