2019 AIChE Annual Meeting

(672b) Knowledge-Based Interactive Preprocessing of Process Data

Authors

Liu, Z. - Presenter, Sun Yat-sen University
Hong, M. - Presenter, Sun Yat-sen University
Heng, Y., Sun Yat-sen University
Peng, M. Y., Sun Yat-sen University

Zhong-Yan Liu (1), Min Hong (1), Min-Yi Peng (1), Yi Heng (2,3,4)*

(1) School of Chemical Engineering and Technology, Sun Yat-sen University, Zhuhai, China
(2) School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
(3) Guangdong Province Key Laboratory of Computational Science, Guangzhou, China
(4) National Supercomputer Center in Guangzhou (NSCC-GZ), Guangzhou, China

* Corresponding author.
E-mail address: hengyi@mail.sysu.edu.cn (Yi Heng)

 

Keywords: Data Preprocessing, Workflow Support, Knowledge Base

 

How to preprocess data with advanced analysis algorithms and put the results to use in applications is, from a holistic viewpoint, a crucial task in practice [1]. To this end, distributed data processing has been proposed for a wide range of industrial applications [2]. On the one hand, Jashapara et al. considered well-established information technologies such as databases and data mining for discovering knowledge [3-5]. On the other hand, workflows for processing real-life data can be created with tools, methods, and standards that provide better support for knowledge management, modeling, simulation, and validation, as well as for other aspects of designing, testing, and commissioning [6-8]. Meanwhile, the amount of process data obtained from engineering has been increasing significantly. Personalized and efficient data preprocessing therefore faces several challenges on the way to next-generation intelligent systems, such as maintaining the quality, accessibility and traceability of data, reviewing and reusing knowledge, and providing useful workflows for practical applications.

 

Figure 1. Overview of the KIDaP management framework

In this work, taking boiling experiments as an application example, we consider a management framework that enables knowledge-based big data preprocessing (KIDaP) for practical tasks in a systematic and efficient way (see Fig. 1). The KIDaP management framework was proposed by Heng, Theissen and Soemers [9]. It consists of two main components: a compendium of workflows that summarizes algorithms and methods for massive data processing, and a knowledge-based interactive open source software tool, KIDaP, that has direct access to the compendium. The software user interface is shown in Fig. 2.

Figure 2. Software user interface of KIDaP

This knowledge-based management framework is intended to guide the user in automating repetitive tasks with workflow support, and it is open to future extensions in other fields of science and engineering. With the KIDaP software, users can specify the input format of their data and then design and execute a workflow. Different processing tools can be used within the platform, while the designed workflow and its individual steps are recorded automatically so that the data and the actions can easily be traced at a later time.
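The automatic recording of workflow steps described above can be illustrated with a minimal Python sketch. It does not reproduce the KIDaP software or its interface; the class `RecordedWorkflow`, the step functions `drop_missing` and `scale`, and the sample data are all hypothetical names introduced here only to show how each executed step and its parameters could be logged for later tracing.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass
class RecordedWorkflow:
    """Hypothetical helper: runs steps one by one and logs each of them."""

    log: List[Dict[str, Any]] = field(default_factory=list)

    def run_step(self, name: str, func: Callable[..., Any], data: Any, **params: Any) -> Any:
        """Apply one processing step and record what was done."""
        result = func(data, **params)
        self.log.append(
            {
                "step": name,
                "function": func.__name__,
                "params": params,
                "time": datetime.now(timezone.utc).isoformat(),
            }
        )
        return result

    def export(self) -> str:
        """Return the recorded steps as JSON so they can be reviewed or replayed."""
        return json.dumps(self.log, indent=2)


def drop_missing(rows, *, key):
    """Illustrative step: remove rows whose value for `key` is missing."""
    return [r for r in rows if r.get(key) is not None]


def scale(rows, *, key, factor):
    """Illustrative step: multiply the value stored under `key` by `factor`."""
    return [{**r, key: r[key] * factor} for r in rows]


# Made-up sample data, not measurement data from the experiment.
workflow = RecordedWorkflow()
raw = [{"t": 0.0, "T": 100.2}, {"t": 1.0, "T": None}, {"t": 2.0, "T": 101.0}]
clean = workflow.run_step("remove gaps", drop_missing, raw, key="T")
scaled = workflow.run_step("rescale", scale, clean, key="T", factor=2.0)
print(workflow.export())
```

Keeping the log as plain data (step name, function, parameters, time stamp) is one simple way to make a processing chain traceable; a real tool would likely also store references to the input and output data of each step.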

The data preprocessing task considered in this work arises from a pool boiling experiment conducted by our collaboration partners. Boiling heat transfer has many applications and is known to be enhanced by micro/nanostructured surfaces. In our collaboration partners' work, Wang et al. fabricated micro/nano bi-porous surfaces, taking into account both optimal cavity size and wettability modification [10]. Firstly, a micro/nano bi-porous copper surface was fabricated with abundant micropores whose diameters lie in the optimal range for cavity size reported in a previous study by other researchers [11]. Secondly, wettability modification was applied to further enhance the boiling heat transfer on the micro/nano bi-porous copper surface. The boiling heat transfer performance of the samples fabricated in each of these two steps was tested with water at atmospheric pressure [10]. These performance tests yield the original experimental temperature time series data. The experimenters processed these massive data manually, step by step, to obtain the boiling performance curve. Based on the data analysis results, the mechanism of the boiling heat transfer can be further analyzed.

Here, we use the proposed knowledge-based management framework to preprocess the original data. As shown in Fig. 3, the data processing work-process is divided into five tasks, of which task 4 is optional. Each task applies a different algorithm or a single step, so that the data can be processed efficiently. First, we import the original data (*.CSV) into the KIDaP tool and perform task 1 (prepare data structure) using the KIDaP built-in functionalities. Then, task 2 (data screening) is executed with an equal-width random selection. In task 3 (data transformation), a series of numerical algorithms (such as the Fourier 1-D conduction equation) are used to analyze the data. Task 4 (data integration) is performed only when necessary. Finally, task 5 (data visualization) is executed: the drawing module is called to plot the data for further analysis and study of the boiling heat transfer mechanism. Compared with manual processing, the proposed knowledge-based management framework is intended to cope with excessive data volume and numerous processing steps, to process and record the workflows efficiently, and to implement parallel parametric sweeps, among other things. As a long-term goal, the proposed knowledge-based management framework is intended to provide a universal solution strategy for this kind of experimental data processing problem.
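Tasks 2 and 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation. "Equal-width random selection" is read here as drawing one random sample from each window of equal width. The transformation uses Fourier's law of one-dimensional conduction, q = -k dT/dx, as a finite difference between two thermocouples. The thermocouple layout, the conductivity value, and all function names are hypothetical.

```python
import random
from typing import List, Sequence


def equal_width_random_selection(values: Sequence[float], width: int, seed: int = 0) -> List[float]:
    """Split the series into consecutive windows of `width` samples and draw
    one random sample from each window (assumed reading of the screening step)."""
    if width <= 0:
        raise ValueError("width must be positive")
    rng = random.Random(seed)  # fixed seed keeps the screening reproducible
    return [rng.choice(values[i:i + width]) for i in range(0, len(values), width)]


def heat_flux(t_lower: float, t_upper: float, spacing_m: float, k: float) -> float:
    """Fourier's law in 1-D, q = -k dT/dx, as a finite difference.

    Assumes two thermocouples in the heating block, the lower (hotter) one
    `spacing_m` metres below the upper one. Returns q in W/m^2.
    """
    return k * (t_lower - t_upper) / spacing_m


def wall_temperature(t_upper: float, q: float, depth_m: float, k: float) -> float:
    """Extrapolate linearly from the upper thermocouple to the boiling surface,
    which is assumed to lie `depth_m` metres above it."""
    return t_upper - q * depth_m / k


# Made-up numbers for illustration only, not data from the experiment.
series = [100.0 + 0.1 * i for i in range(12)]
picked = equal_width_random_selection(series, width=4, seed=42)

K_COPPER = 400.0  # W/(m K), assumed round value for copper
q = heat_flux(t_lower=130.0, t_upper=120.0, spacing_m=0.005, k=K_COPPER)
t_wall = wall_temperature(t_upper=120.0, q=q, depth_m=0.002, k=K_COPPER)
superheat = t_wall - 100.0  # relative to water saturation at atmospheric pressure

print(len(picked), q, t_wall, superheat)
```

Plotting the heat flux against the wall superheat for each steady state would then give one point on a boiling performance curve, which corresponds to the visualization in task 5.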

Figure 3. Suggested workflow based on the KIDaP management framework for the example arising from the pool boiling experiment

In conclusion, the proposed data preprocessing approach is intended to provide workflow support and guidance for experimenters in a more convenient and flexible way. For complex work-processes and huge amounts of industrial data, further in-depth research on the software framework KIDaP is also expected to meet industrial requirements and to move one step closer to future intelligent data preprocessing systems. This research will focus on deep learning methods, on support for interfacing with multiple platforms, and on the foundations for solving complex sequential as well as parallel problems. In future work, more industrial big data preprocessing problems arising from different engineering fields should be tackled with KIDaP based on the proposed management framework. Furthermore, KIDaP will be continuously extended to support intelligent data preprocessing with workflow support so as to satisfy practical requirements.

 

Acknowledgments

The authors particularly thank our collaboration partners Dr. Marcus Soemers and Manfred Theissen for providing the open source software KIDaP and the corresponding work-process compendium. We also thank Dr. Ya-qiao Wang and Hao-ran Lu for providing the measurement data from their pool boiling experiments.

 

References

[1] G.F. Wang, X.T. Tian, J.H. Geng, et al. A Process Innovation Knowledge Management Framework and its Application. Advanced Materials Research, 655-657 (2013) 2299-2306.

[2] P. Gaj, A. Malinowski, T. Sauter, et al. Guest editorial: Distributed Data Processing in Industrial Applications. IEEE Transactions on Industrial Informatics, 11(3) (2015) 737-740.

[3] A. Jashapara. Knowledge Management, An Integrated Approach. Prentice Hall, (2004) 73-108.

[4] E. Žižmond, M. Novak. Controversies of Technology Convergence within The European Union. Industrial Management & Data Systems, 107(5) (2007) 618-635.

[5] S. Natek, D. Lesjak. Improving Knowledge Management by Integrating Hei Process and Data Models. Journal of Computer Information Systems, 53(4) (2015) 81-86.

[6] Z.P. Zhang, S.M. Jasimuddin. Knowledge Market in Organization: Incentive Alignment and IT Support. Industrial Management & Data Systems, 112(7) (2012) 1101-1122.

[7] Y. Wang, S. Yu, T. Xu. A User Requirement Driven Framework for Collaborative Design Knowledge Management. Advanced Engineering Informatics, 33 (2017) 16-28.

[8] D. Mishra, S. Aydin, A. Mishra, et al. Knowledge Management in Requirement Elicitation: Situational Methods View. Computer Standards & Interfaces, 56 (2018) 49-61.

[9] Y. Heng, M. Soemers, M. Theißen. Wissensbasierte Aufbereitung von Prozessdaten mit Workflow-Support. Mitteldeutsche Mitteilungen, 1 (2012) 8-9.

[10] Y.Q. Wang, J.L. Luo, D.C. Mo, et al. Wettability Modification to further Enhance the Pool Boiling Performance of the Micro Nano Bi-porous Copper Surface Structure. International Journal of Heat and Mass Transfer, 119 (2018) 333-342.

[11] C. Li, Z. Wang, P.I. Wang, et al. Nanostructured Copper Interfaces for Enhanced Boiling. Small, 4(8) (2008) 1084-1088.