Data Wrangling with Python Datatable: Aggregate Columns into New Columns Based on Prefix

Source Data

Datatable docs

# import libraries
from collections import defaultdict
from datatable import dt, f

df = dt.Frame({'sn': [1, 2, 3],
               'C1-1': [4, 2, 1],
               'C1-2': [3, 2, 2],
               'C1-3': [5, 0, 0],
               'H2-1': [4, 2, 0],
               'H2-2': [1, 0, 2],
               'K3-1': [4, 1, 1],
               'K3-2': [2, 2, 2]})

   |    sn   C1-1   C1-2   C1-3   H2-1   H2-2   K3-1   K3-2
   | int32  int32  int32  int32  int32  int32  int32  int32
-- + -----  -----  -----  -----  -----  -----  -----  -----
 0 |     1      4      3      5      4      1      4      2
 1 |     2      2      2      0      2      0      1      2
 2 |     3      1      2      0      0      2      1      2
[3 rows x 8 columns]

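Create a dictionary in which each key is the name of a new column (total_ followed by the prefix, that is, the part of the column name before the hyphen) and each value is the list of columns that start with that prefix. The sn column is skipped.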

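# Group the f-expressions of the data columns by prefix (skip 'sn')
mapping = defaultdict(list)
for entry in df.names[1:]:
    key = entry.split("-")[0]      # 'C1-1' -> 'C1'
    key = f"total_{key}"           # f-string: name of the new column
    mapping[key].append(f[entry])  # f-expression referring to the column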

Replace each list of columns in mapping with a single f-expression that computes their row-wise sum:

mapping = {key: dt.rowsum(value) 
           for key, value in mapping.items()}

{'total_C1': Expr:rowsum([FExpr<f['C1-1']>, FExpr<f['C1-2']>, FExpr<f['C1-3']>]; ),
 'total_H2': Expr:rowsum([FExpr<f['H2-1']>, FExpr<f['H2-2']>]; ),
 'total_K3': Expr:rowsum([FExpr<f['K3-1']>, FExpr<f['K3-2']>]; )}

Aggregate to create new columns


   |    sn  total_C1  total_H2  total_K3
   | int32     int32     int32     int32
-- + -----  --------  --------  --------
 0 |     1        12         5         6
 1 |     2         4         2         3
 2 |     3         3         2         3
[3 rows x 4 columns]
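
As a cross-check on the totals above, the same prefix-based aggregation can be sketched in plain Python with no datatable dependency. The names data, totals, and result are illustrative only and are not part of the datatable workflow:

```python
from collections import defaultdict

data = {'sn': [1, 2, 3],
        'C1-1': [4, 2, 1],
        'C1-2': [3, 2, 2],
        'C1-3': [5, 0, 0],
        'H2-1': [4, 2, 0],
        'H2-2': [1, 0, 2],
        'K3-1': [4, 1, 1],
        'K3-2': [2, 2, 2]}

n_rows = len(data['sn'])

# Running row-wise totals, one list per prefix
totals = defaultdict(lambda: [0] * n_rows)
for name, values in data.items():
    if name == 'sn':
        continue  # identifier column, not aggregated
    key = f"total_{name.split('-')[0]}"
    totals[key] = [t + v for t, v in zip(totals[key], values)]

result = {'sn': data['sn'], **totals}
print(result)
```

The printed totals agree row by row with the frame returned by datatable.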