Recently, I have been asked to form a table, the schema is very simple:

pid: STRING (NULLABLE)
rec_list1: STRING (NULLABLE)
rec_list2: STRING (NULLABLE)

It’s a simple mapping from pid to its recommending list. Note that they are all strings. However, My boss wants me to convert to something like this:

pid: STRING (NULLABLE)
left: STRUCT
  pid: NUMBERIC (NULLABLE)
  cat_id: NUMBERIC (NULLABLE)
  cat_name: STRING (NULLABLE)
right: STRUCT
  pid: NUMBERIC (NULLABLE)
  cat_id: NUMBERIC (NULLABLE)
  cat_name: STRING (NULLABLE)

It seems to be easy but took me some time because I am green on advanced usage of sql. Therefore, I am introducing UNNEST, ARRAY, STRUCT, etc to me.

  • UNNEST

    Some documentation from google that I can find. UNNEST will break ARRAY into rows and it’s a powerful breaker. Sometimes there are array of structs, you can use UNNEST to break some and join; Other usage of UNNEST I saw before is to using it to create additional info and joining the additional info into table to filter accordingly. UNNEST can be used together as CROSS JOIN

  • ARRAY

    Again some documentation from google that I can find. ARRAY doesn’t support fancy JOIN inside of it. Something like this below is forbidden by google

      select array(
        select as struct pid, cat
        from unnest(some_array)
        left join some_table
        using some_column
        order by some_other_column
      )
    

    LEFT JOIN is prohibited inside array, google will error you do an efficient join outside. However INNER JOIN/JOIN will work.

  • STRUCT

    It’s like an object that you can put a lot of things inside, but remember it’s hard to do ORDER BY, GROUP all related operations touching the field inside. I would recommend prepare the data before actually forming the STRUCT.

Additionally, use Common Table Expression (CTE) wisely. As I mentioned earlier, avoid doing sorting and grouping involves STRUCT. My trick would be join tables using sub CTE before hand and form structs and only do simple sort and group when necessary.