Browse by author "Vo, Bay"
Now showing 1 - 5 of 5

Item: An Approach for Incremental Mining of Clickstream Patterns as a Service Application (IEEE Computer Soc, 2023) Huynh, Huy M.; Vo, Bay; Oplatkova, Zuzana K.; Pedrycz, Witold
Sequential pattern mining in general and one particular form, clickstream pattern mining, are data mining topics that have recently attracted attention due to their potential applications of discovering useful patterns. However, in order to provide them as real-world service applications, one issue that needs to be addressed is that traditional algorithms often view databases as static. In reality, databases often grow over time and invalidate parts of the previous results after updates, forcing the algorithms to rerun from scratch on the updated databases to obtain updated frequent patterns. This can be inefficient as a service application due to the cost in terms of resources, and the returning of results to users can take longer when the databases get bigger. The response time can be shortened if the algorithms update the results based on incremental changes in databases. Thus, we propose PF-CUP (pre-frequent clickstream mining using pseudo-IDList), an approach towards incremental clickstream pattern mining as a service. The algorithm is based on the pre-large concept to maintain and update results and a data structure called a pre-frequent hash table to maintain the information about patterns. The experiments completed on different databases show that the proposed algorithm is efficient in incremental clickstream pattern mining.

Item: An Approach to Semantic-Aware Heterogeneous Network Embedding for Recommender Systems (IEEE-Inst Electrical Electronics Engineers Inc, 2023) Pham, Phu; Nguyen, Loan T. T.; Nguyen, Ngoc-Thanh; Pedrycz, Witold; Yun, Unil; Lin, Jerry Chun-Wei; Vo, Bay
Recent studies on heterogeneous information network (HIN) embedding-based recommendations have encountered challenges.
These challenges are related to the data heterogeneity of the associated unstructured attribute or content (e.g., text-based summary/description) of users and items in the context of HIN. In order to address these challenges, in this article, we propose a novel approach of semantic-aware HIN embedding-based recommendation, called SemHE4Rec. In our proposed SemHE4Rec model, we define two embedding techniques for efficiently learning the representations of both users and items in the context of HIN. These rich-structural user and item representations are then used to facilitate the matrix factorization (MF) process. The first embedding technique is a traditional co-occurrence representation learning (CoRL) approach which aims to learn the co-occurrence of structural features of users and items. These structural features are represented for their interconnections in terms of meta-paths. In order to do that, we adopt the well-known meta-path-based random walk strategy and heterogeneous Skip-gram architecture. The second embedding approach is a semantic-aware representation learning (SRL) method. The SRL embedding technique is designed to focus on capturing the unstructured semantic relations between users and item content for the recommendation task. Finally, all the learned representations of users and items are then jointly combined and optimized while integrating with the extended MF for the recommendation task. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed SemHE4Rec in comparison with the recent state-of-the-art HIN embedding-based recommendation techniques, and reveal that the joint text-based and co-occurrence-based representation learning can help to improve the recommendation performance.

Item: Efficient strategies for spatial data clustering using topological relations (Springer, 2025) Nguyen, Trang T. D.; Nguyen, Loan T.
T.; Bui, Quang-Thinh; Duy, Le Nhat; Pedrycz, Witold; Vo, Bay
Using topology in data analysis is a promising new field, and recently, it has attracted numerous researchers and played a vital role in both research and application. This study explores the burgeoning field of topology-based data analysis, mainly focusing on its application in clustering algorithms within data mining. Our research addresses the critical challenges of reducing execution time and enhancing clustering quality, which includes decreasing the dependency on input parameters - a notable limitation in current methods. We propose five innovative strategies to optimize clustering algorithms that utilize topological relationships by combining solutions of expanding points fewer times, merging clusters, and using a jump to increase the radius value according to the nearest neighbor distance array index. These strategies aim to refine clustering performance by improving algorithmic efficiency and the quality of clustering outcomes. This approach elevates the standard of cluster analysis and contributes significantly to the evolving landscape of data mining and analysis.

Item: Uncertainty Oriented-Incremental Erasable Pattern Mining Over Data Streams (Institute of Electrical and Electronics Engineers Inc., 2025) Kim, Hanju; Cho, Myungha; Kim, Hyeonmo; Baek, Yoonji; Lee, Chanhee; Ryu, Taewoong; Kim, Heonho; Park, Seungwan; Kim, Doyoon; Kim, Doyoung; Kim, Sinyoung; Vo, Bay; Lin, Jerry Chun-Wei; Pedrycz, Witold; Yun, Unil
In a manufacturing factory, product lines are organized by several constituents and exhibit a profit value, i.e., income from products. Erasable patterns are less profitable patterns whose gain, i.e., the sum of product profits, does not exceed a user-defined threshold. Mining erasable patterns provides the necessary information to users who want to increase profits by erasing less profitable patterns.
There are requirements for a method which efficiently manages uncertain databases in incremental environments to identify erasable patterns that consider uncertainty. Because our novel technique uses a list structure, it is more efficient at finding erasable patterns from incremental databases. Moreover, accumulated stream data should be handled efficiently to identify new useful patterns in both additional data and the existing data. In this article, an algorithm using a list-based structure is proposed to extract erasable patterns containing valuable knowledge from uncertain databases in real time with effective and productive performance. In order to derive erasable patterns from continuously accumulated stream databases, the structure efficiently manages the information gathered from the previous database. Extensive performance and pattern quality evaluations were conducted using real and synthetic datasets. The results show that the algorithm performs up to seven times faster than state-of-the-art erasable pattern mining algorithms on real datasets and scales adeptly on synthetic datasets while delivering reliable and significant result patterns.

Item: Uncertainty oriented-incremental erasable pattern mining over data streams (Institute of Electrical and Electronics Engineers Inc., 2024) Kim, Hanju; Cho, Myungha; Kim, Hyeonmo; Baek, Yoonji; Lee, Chanhee; Ryu, Taewoong; Kim, Heonho; Park, Seungwan; Kim, Doyoon; Kim, Doyoung; Kim, Sinyoung; Vo, Bay; Lin, Jerry Chun-Wei; Pedrycz, Witold; Yun, Unil
In a manufacturing factory, product lines are organized by several constituents and exhibit a profit value, i.e., income from products. Erasable patterns are less profitable patterns whose gain, i.e., the sum of product profits, does not exceed a user-defined threshold. Mining erasable patterns provides the necessary information to users who want to increase profits by erasing less profitable patterns.
There are requirements for a method which efficiently manages uncertain databases in incremental environments to identify erasable patterns that consider uncertainty. Because our novel technique uses a list structure, it is more efficient at finding erasable patterns from incremental databases. Moreover, accumulated stream data should be handled efficiently to identify new useful patterns in both additional data and the existing data. In this article, an algorithm using a list-based structure is proposed to extract erasable patterns containing valuable knowledge from uncertain databases in real time with effective and productive performance. In order to derive erasable patterns from continuously accumulated stream databases, the structure efficiently manages the information gathered from the previous database. Extensive performance and pattern quality evaluations were conducted using real and synthetic datasets. The results show that the algorithm performs up to seven times faster than state-of-the-art erasable pattern mining algorithms on real datasets and scales adeptly on synthetic datasets while delivering reliable and significant result patterns.
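
The definition of an erasable pattern given in the abstract above (a set of components whose gain does not exceed a user-defined threshold) can be illustrated with a small sketch. This is not the authors' list-based algorithm, and it models neither uncertainty nor incremental updates. It is a brute-force illustration that assumes the gain of a pattern is the total profit of all products using at least one of its components, and that the threshold is given as a fraction of the total profit. The toy data and the names `products`, `gain`, and `erasable_patterns` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each product is (set of components, profit).
products = [
    ({"a", "b"}, 100),
    ({"b", "c"}, 50),
    ({"c", "d"}, 30),
    ({"d"}, 20),
]


def gain(pattern, products):
    """Sum the profits of all products that use at least one component of pattern."""
    return sum(profit for items, profit in products if items & pattern)


def erasable_patterns(products, threshold_ratio):
    """Enumerate every component set whose gain is at most threshold_ratio * total profit.

    Brute force over all subsets; only suitable for tiny examples.
    """
    total = sum(profit for _, profit in products)
    limit = threshold_ratio * total
    components = sorted(set().union(*(items for items, _ in products)))
    result = {}
    for size in range(1, len(components) + 1):
        for combo in combinations(components, size):
            g = gain(set(combo), products)
            if g <= limit:
                result[combo] = g
    return result


if __name__ == "__main__":
    # Total profit is 200, so a ratio of 0.5 allows a gain of at most 100.
    for pattern, g in erasable_patterns(products, 0.5).items():
        print(pattern, g)
```

With a ratio of 0.5, the sketch reports the single components `a`, `c`, and `d` and the pair `(c, d)` as erasable; component `b` is not, because the products that depend on it account for 150 of the 200 total profit.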