• Big Data Platform

    Big data technology is widely adopted across many disciplines. Big data is by nature complex, volatile, weakly correlated, and sparse in value, which makes it difficult to form standardized, systematic technical solutions that address the diverse requirements of big data life cycle management in different application domains. To build sustainable big data application systems, and to support their rapid development and delivery of the expected value with minimal effort, we develop innovative engineering technology and an integrated platform for big data applications. The major challenges to be addressed are data life cycle management, covering data collection, storage, computation, analysis, and visualization, as well as the software systems engineering life cycle.

  • Event Data Management

    Event data management focuses on the management, mining, and analysis of massive amounts of event data. The major data types are temporal data and graph data, and the work covers processing algorithms, system design, and applications. Key technologies include event data storage, feature identification, feature indexing, and efficient searching. We have published more than 80 papers in this area at top-level conferences and journals, e.g., SIGMOD, VLDB, ICDE, IEEE TSC, WWW, Computers in Industry, DMKD, JIIS, SoSyM, and Information Processing Letters, which have been cited more than 1,000 times according to Google Scholar.

  • Media Data Management

    Media data search and analysis focuses on multimedia information retrieval and management, in particular visual object classification, automatic semantic annotation, content-based multimedia indexing, and social multimedia retrieval, mining, and recommendation. The media group has published more than 50 research papers in international conferences and journals (CVPR, ICCV, SIGIR, ICML, AAAI, IEEE TKDE, ACM TIST, CVIU, MTAP) and has applied for 8 patents in China.

  • Web Data Mining

    Machine learning is a well-established field that designs computational processes for acquiring new knowledge or skills and for optimizing system performance, drawing inspiration from human behaviour. The web data mining group's research interests include machine learning and data mining algorithms and techniques for unstructured web data, social networks, graph data, and stream data. The group has published over 100 papers in relevant conferences and journals, including TKDE, PVLDB, KDD, SIGIR, IJCAI, AAAI, ACM Multimedia, WSDM, CIKM, and ADKDD.

  • Industrial Big Data

    Today's industrial processes are tightly coupled with information technology, and the product data accumulated daily far exceeds the capacity of conventional data processing approaches. The Industrial Big Data group aims to provide key enabling technology for enterprises to build new business services from accumulated data, in particular data storage, processing, mining, and the generation of added value, so that traditional industrial organizations can transform into advanced high-end manufacturing and service enterprises. In this work, the data objects are mostly sensor data produced by construction machinery. Key techniques for big data management and mining are applied to extract patterns, experience, insights, and value from the data.

  • Medical Big Data

    The recent rapid development of healthcare information technology has led to the accumulation of hospital operational data and patient clinical records. Using knowledge management and big data analytics techniques, we aim to identify general patterns and gain insights from the available data resources, so that value-added services can be provided for patients' health management, medical practitioners' clinical decision making, and healthcare administrators' clinical quality measurement.

  1. Mingsheng Long, Jianmin Wang, Yue Cao. Learning Transferable Features with Deep Adaptation Networks. ArXiv 2015. (To Appear)
  2. Mingsheng Long, Jianmin Wang, Jiaguang Sun, Philip S. Yu. Domain Invariant Transfer Kernel Learning. IEEE Transactions on Knowledge and Data Engineering, TKDE 99: 1-14 (2015)
  3. Mingsheng Long, Jianmin Wang, Guiguang Ding, et al. Transfer Learning with Graph Co-Regularization. IEEE Transactions on Knowledge and Data Engineering, TKDE 26(7): 1805-1818 (2014)
  4. Mingsheng Long, Jianmin Wang, Guiguang Ding, et al. Adaptation Regularization: A General Framework for Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, TKDE 26(5): 1076-1089 (2014)
  5. Xiangdong Huang, Jianmin Wang, Jian Bai, et al. Inherent Replica Inconsistency in Cassandra. IEEE International Conference on Big Data, BigData 2014: 740-747
  6. Yuqing Zhu, Philip S. Yu, Jianmin Wang. RECODS: Replica consistency-on-demand store. IEEE International Conference on Data Engineering, ICDE 2013: 1360-1363
  7. Yuqing Zhu, Philip S. Yu, Jianmin Wang. Latency Bounding by Trading off Consistency in NoSQL Store: A Staging and Stepwise Approach. CoRR abs/1212.1046 (2012)
  8. Yuqing Zhu, Jianmin Wang. Client-centric consistency formalization and verification for system with large-scale distributed data storage. Future Generation Comp. Syst. 26(8): 1180-1188 (2010)
  1. Jianmin Wang, Shaoxu Song, Xuemin Lin, Xiaochen Zhu, Jian Pei. Cleaning Structured Event Logs: A Graph Repair Approach. IEEE International Conference on Data Engineering, ICDE 2015
  2. Shaoxu Song, Aoqian Zhang, Jianmin Wang, Philip S. Yu. SCREEN: Stream Data Cleaning under Speed Constraints. ACM SIGMOD International Conference on Management of Data, SIGMOD 2015
  3. Xiaochen Zhu, Shaoxu Song, Xiang Lian, Jianmin Wang, Lei Zou. Matching Heterogeneous Event Data. ACM SIGMOD International Conference on Management of Data, SIGMOD 2014: 1211-1222
  4. Xiaochen Zhu, Shaoxu Song, Jianmin Wang, Philip S. Yu, Jiaguang Sun. Matching Heterogeneous Events with Patterns. IEEE International Conference on Data Engineering, ICDE 2014: 376-387
  5. Tao Jin, Jianmin Wang, Yun Yang, Lijie Wen, Keqin Li. Refactor Business Process Models with Maximized Parallelism. IEEE Transactions on Services Computing, 2014
  6. Tao Jin, Jianmin Wang, Lijie Wen, Gen Zou. Computing Refined Ordering Relations with Uncertainty for Acyclic Process Models. IEEE Transactions on Services Computing, 2014
  7. Jianmin Wang, Tao Jin, Raymond K. Wong, Lijie Wen. Querying business process model repositories - A survey of current approaches and issues. World Wide Web, 2014
  8. Hedong Yang, Lijie Wen, Jianmin Wang, Raymond K. Wong. CPL+: An improved approach for evaluating the local completeness of event logs. Information Processing Letters, 2014
  9. Jianmin Wang, Shaoxu Song, Xiaochen Zhu, Xuemin Lin. Efficient Recovery of Missing Events. Proceedings of the VLDB Endowment, PVLDB 6(10): 841-852 (2013)
  10. Tao Jin, Jianmin Wang, Marcello La Rosa, Arthur H. M. ter Hofstede, Lijie Wen. Efficient querying of large process model repositories. Computers in Industry, 2013
  11. Jianmin Wang, Raymond K. Wong, Jianwei Ding, Qinlong Guo, Lijie Wen. Efficient Selection of Process Mining Algorithms. IEEE Transactions on Services Computing, 2013
  12. Liang Song, Jianmin Wang, Lijie Wen, Hui Kong. Efficient Semantics-Based Compliance Checking Using LTL Formulae and Unfolding. Journal of Applied Mathematics, 2013
  13. Zhaoxia Wang, Jianmin Wang, Xiaochen Zhu, Lijie Wen. Verification of workflow nets with transition conditions. Journal of Zhejiang University - Science C, 2012
  14. Haiping Zha, Wil M. P. van der Aalst, Jianmin Wang, Lijie Wen, Jiaguang Sun. Verifying workflow processes: a transformation-based approach. Software and System Modeling, 2011
  15. Haiping Zha, Jianmin Wang, Lijie Wen, Chaokun Wang, Jiaguang Sun. A workflow net similarity measure based on transition adjacency relations. Computers in Industry, 2010
  16. Lijie Wen, Jianmin Wang, Wil M. P. van der Aalst, Biqing Huang, Jiaguang Sun. Mining process models with prime invisible tasks. Data & Knowledge Engineering, 2010
  17. Lijie Wen, Jianmin Wang, Wil M. P. van der Aalst, Biqing Huang, Jiaguang Sun. A novel approach for process mining based on event types. Journal of Intelligent Information Systems, 2009
  18. Lijie Wen, Wil M. P. van der Aalst, Jianmin Wang, Jiaguang Sun. Mining process models with non-free-choice constructs. Data Mining and Knowledge Discovery, 2007
  1. Jile Zhou, Guiguang Ding, Yuchen Guo, Qiang Liu, XinPeng Dong, Kernel-Based Supervised Hashing for Cross-View Similarity Search, ICME 2014
  2. Zijia Lin, Guiguang Ding, Mingqing Hu, Jianmin Wang, Multi-label Classification via Feature-aware Implicit Label Space Encoding, ICML 2014
  3. Zijia Lin, Guiguang Ding, Mingqing Hu, Yunzhen Lin, Shuzhi Sam Ge, Image Tag Completion via Dual-view Linear Sparse Reconstructions, CVIU 2014
  4. Jile Zhou, Guiguang Ding, Yuchen Guo, Latent Semantic Sparse Hashing for Cross-Modal Similarity Search, SIGIR 2014.
  5. Mingsheng Long, Jianmin Wang, Guiguang Ding, Philip Yu, Transfer Joint Matching for Visual Domain Adaptation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014).
  6. Guiguang Ding, Yuchen Guo, Jile Zhou, Collective Matrix Factorization Hashing for Multimodal Data, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014).
  7. Zijia Lin, Guiguang Ding, Mingqing Hu, Image Auto-annotation via Tag-dependent Random Search over Range-constrained Visual Neighbours, Multimedia Tools and Applications (2014).
  8. Zijia Lin, Guiguang Ding, Mingqing Hu, Multi-source Image Auto-annotation, ICIP 2013: 2567-2571 (Oral, Top 10% Paper)
  9. Z. Lin, G. Ding, M. Hu, et al. Image Tag Completion via Image-Specific Linear Sparse Reconstructions, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013).
  10. M. Long, G. Ding, J. Wang, Philip Yu, Transfer Sparse Coding for Robust Image Representation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013).
  11. J Shi, M Long, Q Liu, G Ding, J Wang, Twin Bridge Transfer Learning for Sparse Collaborative Filtering, Advances in Knowledge Discovery and Data Mining, 496-507
  12. W Zhang, G Ding, L Chen, C Li, C Zhang, Generating virtual ratings from Chinese reviews to augment online recommendations, ACM Transactions on Intelligent Systems and Technology (TIST) 4 (1), 9.
  13. Z. Lin, G. Ding, M. Hu, J. Wang, J. Sun, Automatic image annotation using tag-related random search over visual neighbors, In Proceedings of the 21st ACM international conference on Information and knowledge management, CIKM '12.
  14. M. Long, J. Wang, G. Ding, D. Shen, Q. Yang, Transfer Learning with Graph Co-Regularization, In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, AAAI '12, pp. 1033-1039.
  15. M Long, J Wang, G Ding, W Cheng, X Zhang, W Wang, Dual transfer learning, 12th SIAM International Conference on Data Mining (SDM 2012).
  1. C. Wan, X. Jin, G. Ding, D. Shen. Gaussian Cardinality Restricted Boltzmann Machines. Proc. 29th AAAI Conf. on Artificial Intelligence (AAAI). 2015.
  2. Jun Zhang, Chaokun Wang, Jianmin Wang, and Jeffrey Xu Yu. Inferring Continuous Dynamic Social Influence and Personal Preference for Temporal Behavior Prediction. PVLDB 2014
  3. Jun Zhang, Chaokun Wang, and Jianmin Wang. Who Proposed the Relationship? --- Recovering the Hidden Directions of Undirected Social Networks. WWW 2014
  4. Jun Chen, Chaokun Wang, and Jianmin Wang. Modeling the Interest-Forgetting Curve for Music Recommendation. ACM Multimedia 2014
  5. Jun Zhang, Chaokun Wang, and Jianmin Wang. Learning Temporal Dynamics of Behavior Propagation in Social Networks. AAAI 2014
  6. Jun Chen, Chaokun Wang, Lei Yang, Qingfu Wen, and Xu Wang. MiSCon: A Hot Plugging Tool for Real-time Motion-based System Control. ACM Multimedia 2014
  7. Raymond Y. K. Lau, Chunping Li, Stephen S. Y. Liao: Social analytics: Learning fuzzy product ontologies for aspect-oriented sentiment analysis. Decision Support Systems 65: 80-94, 2014
  8. Wenping Zhang, Raymond Y.K. Lau, Chunping Li, Adaptive Big Data Analytics for Deceptive Review Detection in Online Social Media, Proceedings of International Conference on Information Systems (ICIS), 2014
  9. W. Cheng, X. Jin, J. Sun, X. Lin, X. Zhang, W. Wang. Searching Dimension Incomplete Databases. IEEE Transactions on Knowledge and Data Engineering (TKDE). 26(3): 725-738, 2014.
  10. Jun Zhang, Chaokun Wang, Yuanchi Ning, Yichi Liu, Jianmin Wang, and Philip Yu. LaFT-Explorer: Inferring, Visualizing and Predicting How Your Social Network Expands. ACM SIGKDD 2013.
  11. Yiyuan Bai, Chaokun Wang, Yuanchi Ning, Hanzhao Wu, and Hao Wang. G-Path: Flexible Path Pattern Query on Large Graphs. WWW 2013.
  12. Jun Zhang, Chaokun Wang, Philip Yu, and Jianmin Wang. Learning Latent Friendship Propagation Networks with Interest Awareness for Link Prediction. ACM SIGIR 2013.
  13. X. Ding, X. Jin, Y. Li, L. Li. Celebrity Recommendation with Collaborative Social Topic Regression. Proc. 23rd Intl. Joint Conf. on Artificial Intelligence (IJCAI). 2013.
  14. Tong Zhao, Chunping Li, Mengya Li, Social Recommendation Incorporating Topic Mining and Social Trust Analysis, In Proceedings of ACM CIKM 2013: 1643-1648
  15. Yajie Miao, Chunping Li, Jie Tang, Lili Zhao: Identifying new categories in community question answering archives: a topic modeling approach. In Proceedings of ACM CIKM 2010: 1673-1676
  16. Ying Liu, Hui Zhang, Chunping Li, Roger Jianxin Jiao: Workflow simulation for operational decision support using event graph through process mining. Decision Support Systems 52(3): 685-697 (2012)
  17. L. Li, X. Jin, S. Pan, J. Sun. Multi-Domain Active Learning for Text Classification. Proc. 18th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining (KDD). 2012.
  18. L. Li, X. Jin, M. Long. Topic Correlation Analysis for Cross-Domain Text Classification. Proc. 26th AAAI Conf. on Artificial Intelligence (AAAI). 2012.
  19. X. Wang, X. Jin, M. Chen, K. Zhang, D. Shen. Topic mining over asynchronous text sequences. IEEE Transactions on Knowledge and Data Engineering (TKDE), 24(1), 2012.
  20. M. Chen, X. Jin, D. Shen. Short Text Classification Improved by Learning Multi-Granularity Topics. Proc. 22nd Intl. Joint Conf. on Artificial Intelligence (IJCAI). 2011.
  21. Zhang Liu, Chaokun Wang, Yiyuan Bai, Hao Wang, and Jianmin Wang. MUSIZ: A Generic Framework for Music Resizing with Stretching and Cropping. ACM Multimedia 2011.
  22. Peng Zou, Chaokun Wang, Zhang Liu, Jianmin Wang, and Jia-Guang Sun. A Cloud based SIM DRM Scheme for the Mobile Internet. ACM CCS 2010.
  23. Chaokun Wang, Jianmin Wang, Xuemin Lin, Wei Wang, Haixun Wang, Hongsong Li, Wanpeng Tian, Jun Xu, and Rui Li. MapDupReducer: Detecting Near Duplicates over Massive Datasets. ACM SIGMOD 2010.
  24. Zhang Liu, Chaokun Wang, Jianmin Wang, Wei Zheng, and Shengfei Shi. Structure-Aware Music Resizing Using Lyrics. WWW 2010.
  25. Y. Zhang, X. Jin. Concept Sampling: Towards Systematic Selection in Large-Scale Mixed Concepts in Machine Learning. Proc. 20th Intl. Joint Conf. on Artificial Intelligence (IJCAI). 2007.
  26. X. Jin, X. Zuo, K. Lam, J. Wang, J. Sun. Efficient Discovery of Emerging Frequent Patterns in Arbitrary Windows on Data Streams. Proc. 22nd Intl. Conf. on Data Engineering (ICDE). 2006.
  27. X. Jin, Y. Lu, C. Shi. Similarity Measure Based on Partial Information of Time series. Proc. 8th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining (KDD). 2002.