1. Support Iceberg tables
2. Read Parquet files through Arrow
3. Assume UTC when data files carry no timezone information
4. Add `split` and `stratify` methods to the `Bag` class for splitting data into training and test sets
5. Remove the Hadoop dependency
6. Pause publishing the Spark module, because Spark depends on Hadoop, which runs only on Java 8/11
7. Bug fixes
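
The UTC fallback noted above can be illustrated with a minimal sketch. It is not the library's actual reader code: the class `UtcDefault` and its method `toInstant` are hypothetical names, and the sketch assumes timestamps arrive as ISO-8601 strings. A value with an explicit offset keeps it; a value without one is interpreted as UTC.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

/** Hypothetical helper for illustration only; not part of the library's API. */
final class UtcDefault {
    private UtcDefault() {}

    /** Parses an ISO-8601 timestamp; if it carries no offset, treats it as UTC. */
    static Instant toInstant(String text) {
        // parseBest tries each query in order and returns the first that succeeds:
        // an OffsetDateTime when an offset is present, otherwise a LocalDateTime.
        TemporalAccessor parsed = DateTimeFormatter.ISO_DATE_TIME.parseBest(
                text, OffsetDateTime::from, LocalDateTime::from);
        if (parsed instanceof OffsetDateTime) {
            return ((OffsetDateTime) parsed).toInstant();
        }
        // No timezone information in the value: assume UTC.
        return ((LocalDateTime) parsed).toInstant(ZoneOffset.UTC);
    }
}
```

With this sketch, `UtcDefault.toInstant("2024-01-01T12:00:00")` and `UtcDefault.toInstant("2024-01-01T12:00:00Z")` yield the same instant, while `"2024-01-01T12:00:00+02:00"` is shifted by its own offset.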