TY - GEN
T1 - Scalable Teacher-Forcing Networks under Spark Environments for Large-Scale Streaming Problems
AU - Za'in, Choiru
AU - Ashfahani, Andri
AU - Pratama, Mahardhika
AU - Lughofer, Edwin
AU - Pardede, Eric
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
N2 - Large-scale data streams remain an open issue in the existing literature. They feature a never-ending information flow that mostly exceeds the capacity of a single processing node. However, algorithmic development of large-scale streaming algorithms under distributed platforms faces a major challenge due to the scalability issue: the network complexity grows exponentially with the number of data batches, leading to an accuracy loss if the model fusion phase is not properly designed. A large-scale streaming algorithm, namely the Scalable Teacher Forcing Network (ScatterNet), is proposed here. ScatterNet has an elastic structure to handle concept drift at the local scale, within a data batch, or at the global scale, across batches. It is built upon the teacher forcing concept, providing a short-term memory aptitude. ScatterNet features a data-free model fusion approach consisting of a zero-shot merging mechanism and online model selection. Our numerical study demonstrates a moderate improvement in prediction accuracy by ScatterNet while gaining a competitive advantage in execution time compared to its counterparts.
AB - Large-scale data streams remain an open issue in the existing literature. They feature a never-ending information flow that mostly exceeds the capacity of a single processing node. However, algorithmic development of large-scale streaming algorithms under distributed platforms faces a major challenge due to the scalability issue: the network complexity grows exponentially with the number of data batches, leading to an accuracy loss if the model fusion phase is not properly designed. A large-scale streaming algorithm, namely the Scalable Teacher Forcing Network (ScatterNet), is proposed here. ScatterNet has an elastic structure to handle concept drift at the local scale, within a data batch, or at the global scale, across batches. It is built upon the teacher forcing concept, providing a short-term memory aptitude. ScatterNet features a data-free model fusion approach consisting of a zero-shot merging mechanism and online model selection. Our numerical study demonstrates a moderate improvement in prediction accuracy by ScatterNet while gaining a competitive advantage in execution time compared to its counterparts.
KW - Distributed Learning
KW - Large-scale data stream analytics
KW - Lifelong learning
KW - Spark
UR - http://www.scopus.com/inward/record.url?scp=85088104132&partnerID=8YFLogxK
U2 - 10.1109/EAIS48028.2020.9122752
DO - 10.1109/EAIS48028.2020.9122752
M3 - Conference contribution
AN - SCOPUS:85088104132
T3 - IEEE Conference on Evolving and Adaptive Intelligent Systems
BT - 2020 IEEE International Conference on Evolving and Adaptive Intelligent Systems, EAIS 2020 - Proceedings
A2 - Castellano, Giovanna
A2 - Castiello, Ciro
A2 - Mencar, Corrado
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th IEEE International Conference on Evolving and Adaptive Intelligent Systems, EAIS 2020
Y2 - 27 May 2020 through 29 May 2020
ER -