Abstract
Cross-domain multistream classification is a challenging problem that calls for fast domain adaptation to handle different but related streams in never-ending and rapidly changing environments. Although existing multistream classifiers assume no labeled samples in the target stream, they still incur expensive labeling costs because they require fully labeled samples of the source stream. This article attacks the problem of extreme label shortage in cross-domain multistream classification, where only very few labeled samples of the source stream are provided before the process runs. Our solution, Learning Streaming Process from Partial Ground Truth (LEOPARD), is built upon a flexible deep clustering network whose hidden nodes, layers, and clusters are added and removed dynamically in response to varying data distributions. The deep clustering strategy is underpinned by a simultaneous feature learning and clustering technique, leading to clustering-friendly latent spaces. The domain adaptation strategy relies on adversarial domain adaptation, where a feature extractor is trained to fool a domain classifier that distinguishes source samples from target samples. Our numerical study demonstrates the efficacy of LEOPARD, which delivers improved performance compared to prominent algorithms in 15 of 24 cases. The source code of LEOPARD is shared at https://github.com/wengweng001/LEOPARD.git to enable further study.
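The adversarial component described above follows the standard domain-adversarial training idea: a feature extractor is optimized to produce representations that a domain classifier cannot separate into source and target. Below is a minimal, hypothetical PyTorch sketch of that mechanism using a gradient-reversal layer; the module names, layer sizes, and training step are illustrative assumptions, not LEOPARD's actual architecture (see the linked repository for the authors' implementation).

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient flows into the feature extractor; no gradient for lambd.
        return -ctx.lambd * grad_output, None

class FeatureExtractor(nn.Module):
    """Hypothetical extractor mapping raw inputs to a shared latent space."""
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))

    def forward(self, x):
        return self.net(x)

class DomainClassifier(nn.Module):
    """Binary classifier separating source (label 0) from target (label 1) features."""
    def __init__(self, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, z, lambd=1.0):
        # The reversal makes the extractor maximize the domain loss (domain-invariant
        # features) while the classifier itself still minimizes it.
        return self.net(GradientReversal.apply(z, lambd))

# One adversarial adaptation step on a pair of mini-batches (toy data).
extractor = FeatureExtractor(in_dim=10, latent_dim=8)
domain_clf = DomainClassifier(latent_dim=8)
optimizer = torch.optim.Adam(
    list(extractor.parameters()) + list(domain_clf.parameters()), lr=1e-3
)
criterion = nn.CrossEntropyLoss()

x_source, x_target = torch.randn(32, 10), torch.randn(32, 10)
z = extractor(torch.cat([x_source, x_target]))
domain_labels = torch.cat(
    [torch.zeros(32, dtype=torch.long), torch.ones(32, dtype=torch.long)]
)
loss = criterion(domain_clf(z), domain_labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In a streaming setting this step would be interleaved with the clustering and (few-shot) supervised losses on each incoming mini-batch, so the latent space stays aligned across domains as the distributions drift.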
Original language | English |
---|---|
Pages (from-to) | 6839-6850 |
Number of pages | 12 |
Journal | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 34 |
Issue number | 10 |
DOIs | |
Publication status | Published - 1 Oct 2023 |
Externally published | Yes |
Keywords
- Concept drifts
- data streams
- incremental learning
- multistream classification
- transfer learning