Tran An Cong, Ho Lai Thi, Nguyen Hai Thanh
(College of Information and Communication Technology, Can Tho University, Vietnam; tcan@cit.ctu.edu.vn, lai01633196630@gmail.com, nthai.cit@ctu.edu.vn)
Copyright © The Institute of Electronics and Information Engineers (IEIE)
Keywords
Information extraction, Text recognition, Vietnamese invoices
1. Introduction
Commercial activity around the world shows that the commercial invoice is an essential
document for business and production activities. Although the main purpose of a commercial
invoice is payment certification, invoices have many other uses, such as record keeping,
legal protection, tax payment, and business analytics. Moreover, a commercial invoice
is a legal document allowing the seller to obtain money from the buyer. It therefore
records the details of money-related transactions, such as the total price in numbers
and in words, the price of each item, unit quantities, and the currency demanded,
and carries a seal and signature to ensure the payment for service is clear. Invoices
can be hand-written, printed, or electronic.
Although most countries have deployed e-invoices, some countries (such as Vietnam)
still accept both hand-written and electronic documents. Therefore, the ability to
process large numbers of hand-written and electronic invoices is essential. For a
long time, we have relied on hand-written invoices to process payments, and invoice
reconciliation often takes a long time: everything must be recorded in a ledger manually
or entered into software for future retrieval or reference. One limitation of these
procedures is their high cost. In addition, such repetitive tasks consume a lot of
time. However, this work can be done better, with less time and effort, by automating
it with information technology.
Numerous countries, including Vietnam, are speeding up the process of digital transformation
and information integration in various fields. For commercial activities, stores design
new systems and want to integrate data produced by older systems. Many companies hire
numerous people to record the data. In recent years, we have witnessed numerous achievements
in many fields through advancements in information technology. Such techniques can
aid the processing of text and can provide quick and efficient information extraction
from invoices. In this study, we leverage a robust deep learning technique, the graph
convolutional network (GCN), to analyze Vietnamese store invoices, achieving an accuracy
of 99.5% and an F1-score of 98.52%, which is promising for practical applications
in Vietnamese stores. Another contribution is a collection of Vietnamese invoices
that can be used in further studies.
The characteristics of our method can be summarized as follows.
· We collected invoices from a chain of G7 stores in Vietnam to analyze the efficiency
of a GCN in extracting information, as a case study on Vietnamese-language invoices.
· The proposed method extracted and recognized the essential areas of an invoice,
including the name of the store (called the store title in this work), the address
of the store, the date of issue, the total payment due, and other areas.
· We applied two GCNs for comparison: ChebConv and GCNConv. Furthermore, we evaluated
the advantages and disadvantages of the GCNs and compared their efficiency in terms
of running time and accuracy in the recognition tasks.
The remainder of this study is organized as follows. Section 2 outlines the related
work, and Section 3 introduces and describes our proposed Vietnamese invoice information
extraction method. Section 4 reports and discusses the experiments and findings. The
final section concludes this research and presents various perspectives.
2. Related Work
Information extraction automatically obtains the necessary information from a document;
the main extraction step is word classification (i.e., tagging), and the output is
usually stored in the form of key-value pairs.
The task of extracting information from invoices is based on textual content and the
positions of text frames, which are computed and classified so that the needed information
can be extracted. In [1], the authors deployed Chargrid [2] and Wordgrid [3] to extract information from scanned documents. A review paper [4] investigated many methods for performing information extraction tasks and listed
some of the existing challenges for further research. Finally, a survey [5] summarized methods for extracting information from unstructured and multidimensional
data.
Recognition and analysis of invoices have been investigated in numerous studies with
various optical character recognition (OCR) approaches. In [6], the authors implemented OCRMiner, extracting and indexing metadata from scanned
images of structured documents with OCR techniques, and the authors in [7] extracted value-added tax (VAT) information from invoices, reaching an accuracy of 96.2%.
The work in [8] extracted key information from invoices by using simulated complex scenes, combining
prior knowledge and data augmentation techniques such as adding random noise, color
jitter, horizontal lines, and random rotation. In another study [9], the authors deployed OCR to extract invoice numbers, dates, final payment amounts,
and related descriptions from bills and invoices, exporting and transferring them
to a database for later use. The authors in [10] implemented ZXing code technology and OCR for invoice identification tasks. Finally,
a method of extracting and indexing metadata of (semi-)structured documents was introduced
in [11]. Although only a very few samples were used for the training phase, the method obtained
performance that was comparable to a model trained with a large number of samples.
Several types of research on machine learning have provided interesting results from
invoice analysis. The work in [12] investigated some possibilities of deploying unsupervised outlier detection approaches
to detect potential fraud in invoice data. The authors in [13] implemented Light Gradient Boosting Machine and Random Forest for invoice analysis,
saving the data for deductive analysts. Machine learning techniques were attempted
in [14] to detect anomalies in invoices from Tunisia, leveraging techniques such as multivariate
Gaussian distribution and Light Gradient Boosting Machine. Finally, the authors in
[15] used AlexNet (a famous convolutional neural network) to classify three types of receipts
from hand-written and machine-printed invoices. Another study in [16] deployed a Stacked Propagation Network combined with a Graph Attention Network to
extract key information and data points from invoices and bank statements. The authors
in [17] attempted methods to indicate the invoice information area, and deployed a projection
technique to perform single-character cutting from electronic invoices.
The work in [18] used Support Vector Machines to evaluate risk pre-warnings from invoices, achieving
an accuracy of 97% when classifying three types of risk (denoted A, B, and C by the
authors). A graph convolutional network was tried in [19] to recognize and detect tables in invoices and extract their details. Ensemble
learning algorithms in [20] analyzed electronic invoices from financial transactions, and machine learning techniques
in [21] provided anomaly detection in electronic invoices. Finally, the authors in [22] used Random Forest to analyze and explore electronic invoices for automobile parts
manufacturers.
3. The Proposed Method
As mentioned in previous sections, extracting information from invoices is based on
textual content and the positions of text frames, which are classified to obtain the
necessary information; approaches are based on templates, natural language processing,
or graphs. The template-based method applies predefined rules to forms and documents
with a fixed structure that does not change much, and text/keyword matching determines
the corresponding fields. However, its most significant disadvantage is that each
rule must be defined separately for each form and does not adapt to new forms. The
natural language processing approach starts by converting the image to text and uses
a named entity recognition (NER) model to classify the text into the corresponding
information fields.
The advantage of this method over the template-based approach is its ability to adapt
to new data. However, it does not take advantage of location features, even though
location helps identify the respective fields; for example, an invoice contains many
price-like strings, which are easily confused with the total amount. To solve the
problems of the above two methods, a third, graph-based approach can resolve the classification
problem. In this study, we deployed GCNs to extract information from invoices.
Our overall proposed architecture for Vietnamese invoice information extraction consists
of the steps in Fig. 1. First, from invoice images captured by mobile phones or a scanner, we identify text
frames and recognize text using Character Region Awareness for Text detection (CRAFT) [23]; we then build graphs and identify features for the nodes, which are fed to the
two GCN architectures. The OCR task is crucial for recognizing text areas and identifying
text content. Next, the graph convolutional network builds links between text fields
and content on the invoice. After embedding the features in the graph, we divide the
dataset into two parts: a training set of 650 invoices with 19,159 data frames and
a test set of 81 invoices with 2424 data frames. During the testing phase, the pre-processing
steps are the same as in the training phase, and extraction is performed with the
model saved from the training stage. Details of the techniques are presented in the
following sections.
Fig. 1. The overall architecture for information extraction.
3.1 Character Region Identification for Text Detection
CRAFT [23] defines the text frames in the invoice image. The main goal is to localize
individual character areas and associate detected characters into text.
First, CRAFT predicts two scores for each character: a region (area) score, which
marks where characters are present and localizes each character, and an affinity (relationship)
score, which indicates how strongly one character tends to combine with another. The
affinity scores merge the characters into words; in the relation map, red indicates
character pairs with high affinity that should be merged into one word, as illustrated
in Fig. 2. Finally, we combine the region and affinity scores to give each word a bounding
box. The coordinates of recognized areas are given in the following order: upper left,
upper right, lower left, and lower right, where each corner is an (x, y) coordinate
pair of the area, as revealed in Fig. 3. A post-processing sketch is given after Fig. 3.
Fig. 2. The heatmap for (a) identifying text frames; (b) revealing their relationships.
Fig. 3. (a) The identified text frames; (b) their coordinates
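To make the combination of the two score maps concrete, the following is a minimal sketch (our illustration, not the CRAFT authors' implementation) of how a region map and an affinity map can be thresholded and merged into word bounding boxes; the threshold values and function names are illustrative assumptions.

```python
# A minimal sketch of CRAFT-style post-processing: combining the region (area)
# score map and the affinity (relationship) score map into word bounding boxes.
import cv2
import numpy as np

def word_boxes(region_score: np.ndarray, affinity_score: np.ndarray,
               text_thresh: float = 0.7, link_thresh: float = 0.4):
    """Merge character regions linked by high affinity into word boxes."""
    # Pixels that belong to a character OR that connect two characters.
    text_mask = (region_score >= text_thresh).astype(np.uint8)
    link_mask = (affinity_score >= link_thresh).astype(np.uint8)
    combined = np.clip(text_mask + link_mask, 0, 1)

    # Each connected component of the combined map is one word candidate.
    n_labels, labels = cv2.connectedComponents(combined, connectivity=4)
    boxes = []
    for lbl in range(1, n_labels):
        ys, xs = np.where(labels == lbl)
        x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
        # Corner order used in this paper: upper left, upper right,
        # lower left, lower right.
        boxes.append([(x0, y0), (x1, y0), (x0, y1), (x1, y1)])
    return boxes
```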
3.2 Recognizing Text and Labeling Sections
To recognize Vietnamese words, we leverage the VietOCR method (https://github.com/pbcquoc/vietocr,
accessed March 5, 2022), which combines convolutional neural networks and Transformer
models to perform recognition tasks. The VietOCR model generalizes well and achieves
high accuracy on new datasets, so we applied it to the text recognition problem, using
the vgg_transformer model during training. First, the text in an invoice is recognized
as seen in Fig. 4(a); we then choose and assign five main areas corresponding to the name of the store
(store title), the address of the store (address), the date issued (date), the total
payment (total), and anything not in the four named fields (NaN), as seen in Fig. 4(b). Finally, we check for missing labels and add them manually. A minimal usage sketch
follows Fig. 4.
Fig. 4. (a) An example of words recognized by VietOCR; (b) marking labels for them.
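As a usage illustration, the following minimal sketch shows how VietOCR's vgg_transformer model can be loaded and applied to a cropped text frame; the file name is a hypothetical example, and the API details follow the repository's documentation.

```python
# A minimal usage sketch of VietOCR (https://github.com/pbcquoc/vietocr)
# with the vgg_transformer weights used in this work.
from PIL import Image
from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cpu'          # or 'cuda:0' if a GPU is available
recognizer = Predictor(config)

# Each text frame detected by CRAFT is cropped and recognized separately.
crop = Image.open('invoice_frame_01.png')   # hypothetical cropped text frame
text = recognizer.predict(crop)
print(text)
```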
3.3 Graph Construction
There are many techniques to create graphs from documents. Most of them convert each
text area into a node and use different techniques to construct edges. This way will
create four edges for each node such that the edge connects the nearest text areas
in four directions (top, bottom, left, right).
From a node, if we can draw a vertical or horizontal line to another node, the two
are connected. At each node in each direction, we choose the edge with the shortest
length. For destination nodes with many connected edges, we choose the edge with the
shortest length. Finally, we create a graph by creating a unique edge at each source
node to the destination node (if any), giving preference to horizontal edges. The
steps for processing edges when building the graph are illustrated in Figs. 5 and 6.
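The following sketch (our illustration, not the exact implementation) captures the core of this rule: for each text frame, keep only the nearest neighbor in each direction. Destination-side pruning and the preference for horizontal edges on ties are omitted for brevity.

```python
# A sketch of the nearest-neighbour edge rule: from each text frame, connect
# to the closest frame in each of the four directions (top, bottom, left,
# right), keeping at most one edge per (source, direction).
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)

def build_edges(boxes: List[Box]) -> List[Tuple[int, int]]:
    edges = set()
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        nearest = {}   # direction -> (distance, node index)
        for j, (u0, v0, u1, v1) in enumerate(boxes):
            if i == j:
                continue
            x_overlap = not (x1 < u0 or u1 < x0)   # a vertical line fits
            y_overlap = not (y1 < v0 or v1 < y0)   # a horizontal line fits
            if x_overlap and v0 > y1:              # j is below i
                cand = ('down', v0 - y1, j)
            elif x_overlap and v1 < y0:            # j is above i
                cand = ('up', y0 - v1, j)
            elif y_overlap and u0 > x1:            # j is right of i
                cand = ('right', u0 - x1, j)
            elif y_overlap and u1 < x0:            # j is left of i
                cand = ('left', x0 - u1, j)
            else:
                continue
            d, dist, k = cand
            if d not in nearest or dist < nearest[d][0]:
                nearest[d] = (dist, k)
        for _, (_, k) in nearest.items():
            edges.add((min(i, k), max(i, k)))      # undirected, no duplicates
    return sorted(edges)
```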
The graph convolutional network is a type of convolutional neural network (CNN) that
operates directly on graphs and takes advantage of their structural information; it
solves the problem of classifying nodes in a graph. The general idea of a GCN is that
each node receives feature information from all of its neighbors along with its own
features; for example, using an average function, the same aggregation is performed
for every node, and the averages are fed into a neural network. In practice, aggregate
functions more complex than the average can be used, and layers can be stacked for
a deeper GCN, with each layer's output serving as input to the next. The number of
layers is the farthest distance a node's information can travel: with one GCN layer,
each node receives information only from its immediate neighbors. The aggregation
takes place independently and simultaneously for all nodes, and it repeats when another
layer is stacked on top of the first; at that point, the neighbors already carry information
about their own neighbors (from the previous step).
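Formally, this neighborhood aggregation corresponds to the layer-wise propagation rule of Kipf and Welling [25], which the GCNConv layers used later implement:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,W^{(l)}\right), \qquad \tilde{A} = A + I_N,$$

where $A$ is the adjacency matrix, $\tilde{D}$ is the degree matrix of $\tilde{A}$, $H^{(l)}$ holds the node features at layer $l$ (with $H^{(0)}$ the input features), $W^{(l)}$ is the layer's trainable weight matrix, and $\sigma$ is an activation function such as ReLU.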
This study deployed a GCN for invoice analysis to explore local patterns through location
and text features. Like a CNN, a GCN can capture local patterns, but instead of operating
on neighboring pixels, it connects strongly related nodes even when they lie far apart
in the image. In addition, the GCN can exploit location features (the position/coordinates
of a node in the image), which help the model distinguish information fields; for
example, the name of the supermarket/grocery store is usually shown at the top of
the bill. Similarly, the GCN's analysis of textual information is essential, for example,
to distinguish the address field from other data fields. In addition, multiple GCN
modules can be stacked on top of each other, which helps the model learn high-level
features better.
We deployed a GCNConv-based architecture comprising four traditional graph convolutional
layers (as illustrated in Fig. 7), whereas the ChebConv-based architecture comprises four Chebyshev spectral graph
convolution layers (each with a Chebyshev filter size of 3). For both architectures,
the outputs of the hidden graph convolutional layers have sizes 64, 32, and 16, respectively,
while the output layer has five outputs, corresponding to the five labels to be classified.
After the network has been initialized, it computes new features from the nodes, edges,
and weights, and the process is repeated in subsequent layers using the features output
by the layer before. The ReLU function activates the outputs of the hidden layers,
and the output layer uses the LogSoftMax function to calculate log probabilities for
classification. As a result, the network receives input with 776 features and produces
five outputs, corresponding to the five areas considered on the invoice (a code sketch
is given after Fig. 7).
Fig. 5. (a) The original graph in which all edges are shown; (b) the graph after unnecessary edges are removed; (c) the graph after the edges with the same targeted node are removed.
Fig. 6. The graphs of invoices built based on text frames.
Fig. 7. The proposed GCN architecture.
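The following is a sketch of this architecture in PyTorch Geometric under the sizes stated above (776 input features; hidden sizes 64, 32, 16; five output classes); it is an illustration consistent with the description rather than the exact training code. Replacing GCNConv with ChebConv(in, out, K=3) yields the Chebyshev variant.

```python
# A sketch of the four-layer graph convolutional architecture described above.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv  # or ChebConv for the Chebyshev variant

class InvoiceGCN(torch.nn.Module):
    def __init__(self, in_dim: int = 776, n_classes: int = 5):
        super().__init__()
        self.conv1 = GCNConv(in_dim, 64)
        self.conv2 = GCNConv(64, 32)
        self.conv3 = GCNConv(32, 16)
        self.conv4 = GCNConv(16, n_classes)   # output layer

    def forward(self, x, edge_index, edge_weight=None):
        x = F.relu(self.conv1(x, edge_index, edge_weight))
        x = F.relu(self.conv2(x, edge_index, edge_weight))
        x = F.relu(self.conv3(x, edge_index, edge_weight))
        x = self.conv4(x, edge_index, edge_weight)
        return F.log_softmax(x, dim=1)        # log probability per node/class
```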
3.4 Feature Embedding
We build initial properties for graph nodes by aggregating features from many attributes,
including Boolean features, position features, and text features. The Boolean features
check the attributes. For example, we check whether the text is numeric or has special
characters, etc., while the position feature is the relative distance from the current
text frame to the next two frames (horizontally and vertically). The text features
deploy the work in$^{https://metatext.io/models/sentence-transformers-distilbert-base-nli-stsb-mean-tokens}$,
and a Siamese network [24] (a natural language processing model) was implemented by the Transformer library
to compute embedded vectors for sentences in order to obtain a 768-D feature vector.
Finally, the network joins all the attributes and gets a 776-D feature vector (6 +
2 + 768) as the initial feature for each node in the graph.
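A sketch of this feature construction follows; the six Boolean checks shown are illustrative assumptions (the text specifies only examples such as numeric content and special characters), while the 768-D sentence embedding uses the distilbert-base-nli-stsb-mean-tokens model referenced above.

```python
# A sketch of the 776-D node feature vector: 6 Boolean flags, 2 relative
# position features, and a 768-D sentence embedding (6 + 2 + 768 = 776).
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')

def node_features(text: str, dx_next: float, dy_next: float) -> np.ndarray:
    flags = np.array([                          # illustrative Boolean checks
        text.isdigit(),                         # purely numeric
        any(c.isdigit() for c in text),         # contains a digit
        any(not c.isalnum() and not c.isspace() for c in text),  # special chars
        text.isupper(),                         # all upper case
        text.istitle(),                         # title case
        any(c.isalpha() for c in text),         # contains letters
    ], dtype=np.float32)
    position = np.array([dx_next, dy_next], dtype=np.float32)  # relative dists
    embedding = encoder.encode(text)            # 768-D sentence vector
    return np.concatenate([flags, position, embedding])
```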
We deployed and compared two algorithms for training: GCNConv [25] and ChebConv [26]. ChebConv generalizes CNNs to graphs, with the main goal of defining filters that
operate on graphs efficiently. GCNConv is an efficient variant in which the spectral
filter is reduced to its first-order approximation to fit the graph with the convolution-based
architecture.
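For reference, ChebConv approximates a spectral graph filter by a truncated Chebyshev expansion (in our case with filter size $K = 3$):

$$g_\theta \star x \approx \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{L})\, x, \qquad \tilde{L} = \frac{2}{\lambda_{\max}} L - I_N,$$

where $T_k$ are the Chebyshev polynomials, $\theta_k$ are learnable coefficients, $L$ is the normalized graph Laplacian, and $\lambda_{\max}$ is its largest eigenvalue; GCNConv keeps only the first-order term of this expansion with further simplifications.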
4. Experiments
4.1 Dataset
We experimented with a dataset collected from G7 stores (https://shopg7.com/), a mini
supermarket chain in Vietnam, with invoices captured by mobile phones and saved as
images at a resolution of 1920×2560. The experiment used 731 invoices with 21,583
data frames organized into five classes: store title, address, date, total, and other.
We extracted 19,159 data frames from 650 invoices for training, while the remaining
81 invoices with 2424 data frames were used as the test set. After the pre-processing
steps, the training data were stored as a dataset file (train_data.dataset) with the
attributes described in Tables 1 and 2.
We evaluated the classification accuracy on the text frames in the invoices and the
effectiveness of the two training models in order to choose the optimal model for
extracting information from invoices.
Table 1. Attributes of the dataset.

Attribute | Description
batch | Identifier of each node on the graph
edge_index | Index of edges
img_id | Image filename
ptr | Pointer to the graph of the next invoice
text | Text corresponding to each node
x | Feature vector of the nodes in the graph
y | The label of each node
Table 2. Sample distributions according to label.

Label | Number of samples
Store Title | 650
Address | 650
Date | 650
Total | 650
Other | 16,559
4.2 Environment Settings
The experiments were run on a computer with an Intel Core i7-4600U CPU at 2.1GHz and
8GB of RAM under the Windows 10 Pro 64-bit operating system and repeated 10 times.
The results were assessed on the test set with many metrics, including accuracy, and
a confusion matrix averaged the 10 repetitions from the training and test phases.
Both networks used a learning rate of 0.001 and the Adam optimizer [27] running for 2000 epochs.
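A minimal sketch of this training configuration is shown below; `train_data` stands for the pre-processed graph data (train_data.dataset, with the attributes of Table 1), and InvoiceGCN refers to the illustrative model sketched in Section 3.3.

```python
# A sketch of the training setup: Adam, learning rate 0.001, 2000 epochs,
# negative log-likelihood loss on the log-softmax node outputs.
import torch
import torch.nn.functional as F

model = InvoiceGCN()                                   # sketch from Section 3.3
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(2000):
    model.train()
    optimizer.zero_grad()
    out = model(train_data.x, train_data.edge_index)   # per-node log probs
    loss = F.nll_loss(out, train_data.y)               # matches log_softmax
    loss.backward()
    optimizer.step()
```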
4.3 The Results on ChebConv
We obtained an average accuracy of 99.81% on the training set and 99.59% on the test
set with ChebConv. In Table 3, Store Title, Address, and Date show perfect classification, with no labels mistakenly
classified, while Total and Other show some errors. These results indicate that the
ChebConv model is highly suitable for the problem of extracting invoice information.
Table 3. The average confusion matrix for ChebConv in the training and test phases.

Average performance on the training set:

 | Store Title | Address | Date | Total | Other
Store Title | 650.0 | 0.0 | 0.0 | 0.0 | 0.0
Address | 0.0 | 650.0 | 0.0 | 0.0 | 0.0
Date | 0.0 | 0.0 | 650.0 | 0.0 | 0.0
Total | 0.0 | 0.0 | 0.0 | 632.0 | 18.0
Other | 0.0 | 0.0 | 0.0 | 18.0 | 16541.0

Average performance on the test set:

 | Store Title | Address | Date | Total | Other
Store Title | 81.0 | 0.0 | 0.0 | 0.0 | 0.0
Address | 0.0 | 81.0 | 0.0 | 0.0 | 0.0
Date | 0.0 | 0.0 | 81.0 | 0.0 | 0.0
Total | 0.0 | 0.0 | 0.0 | 76.0 | 5.0
Other | 0.0 | 0.0 | 0.0 | 5.0 | 2095.0
4.4 The Results from GCNConv
Table 4 shows that every label was misclassified to some degree. In particular, more
than 30 cases with the Total label (nearly 40% on the test set) were misclassified.
Overall, however, the classification results were still relatively good: we obtained
an average accuracy of 96.58% on the training set and 95.26% on the test set. As can
be seen in Fig. 8, accuracy on both sets was fairly high, but it rose and fell unstably, particularly
on the test set.
Table 4. The average confusion matrix for GCNConv in the training and test phases.

Average performance on the training set:

 | Store Title | Address | Date | Total | Other
Store Title | 602.0 | 0.0 | 0.0 | 0.0 | 48.0
Address | 0.0 | 624.7 | 0.0 | 0.0 | 25.3
Date | 0.0 | 0.0 | 576.3 | 0.0 | 73.7
Total | 0.0 | 0.0 | 0.0 | 469.8 | 180.2
Other | 48.0 | 25.3 | 73.7 | 180.2 | 16231.8

Average performance on the test set:

 | Store Title | Address | Date | Total | Other
Store Title | 74.3 | 0.0 | 0.0 | 0.0 | 6.7
Address | 0.0 | 76.5 | 0.0 | 0.0 | 4.5
Date | 0.0 | 0.0 | 66.4 | 0.0 | 14.6
Total | 0.0 | 0.0 | 0.0 | 49.3 | 31.7
Other | 6.7 | 4.5 | 14.6 | 31.7 | 2042.5
4.5 Comparison of ChebConv and GCNConv
As illustrated in Tables 3 and 4, ChebConv had better accuracy than GCNConv in all classes. However, inference with
the ChebConv model was much slower than with GCNConv, with average execution times
of more than 2 hours 30 minutes versus about 16 minutes (as detailed in Table 5). Each architecture has its advantages, depending on the problem to be solved: although
GCNConv produced lower accuracy, it provided results nearly ten times faster than
ChebConv, whose main disadvantage is that it is not time efficient and is hard to
use in real time.
As shown in Fig. 8, the performance of ChebConv during the training phase was higher than GCNConv's,
with less overfitting. The two followed a fairly similar pattern over the first 750
epochs, both remaining above 95% accuracy. After that, however, GCNConv faced overfitting,
with a significant gap between training and test performance, and it seemed to converge
more slowly than ChebConv over the 2000 epochs. Both models nevertheless achieved
relatively high results in general, so applying a GCN to the problem of extracting
invoice information is entirely appropriate.
Based on these results, we built a website to automatically extract invoice information
from G7 stores with a GCN; because the ChebConv model had higher accuracy than GCNConv,
ChebConv was prioritized for the website. The website receives an image of a G7 store
invoice and, after processing, returns an image with classified information frames
along with the content of the extracted information, as illustrated in Fig. 9.
Fig. 8. The performance from (a) GCNConv; (b) ChebConv in the training and test phases.
Fig. 9. An illustration of the application with a Graph Convolutional Neural Network.
Table 5. Average and standard deviation of accuracy and execution time for ChebConv and GCNConv.

 | ChebConv | GCNConv
Accuracy in training | 99.81 (±0.07)% | 96.58 (±0.72)%
Accuracy in testing | 99.59 (±0.01)% | 95.26 (±0.67)%
Execution time | 157 (±23) mins | 16.1 (±2) mins
5. Conclusion
The results reveal that information extraction with a graph convolutional neural network
on Vietnamese invoices is promising, with an accuracy of 99.5%, a prerequisite for
building an automatic invoice information extraction system that handles many invoices
with high accuracy. Five areas of the invoices (store title, address, date, total
payment, and other information) were analyzed; these areas cover important information
that stores and consumers can cross-check for future reference or other requirements.
The model was trained only on G7 store invoices; however, we expect the method to
be applicable to other invoice templates as well. In addition, automatic invoice orientation
detection can be added for precise adjustment, allowing scaling to large and diverse
invoice datasets. The data can also be collected and extended for further research,
most likely to build in spelling correction of the text recognition output.
REFERENCES
Kerroumi M., Sayem O., Shabou A., 2021, VisualWordGrid: Information Extraction from
Scanned Documents Using a Multimodal Approach, in Document Analysis and Recognition
- ICDAR 2021 Workshops, Springer International Publishing, pp. 389-402
Katti A. R., et al., 2018, Chargrid: Towards Understanding 2D Documents, arXiv
Denk T. I., 2019, Wordgrid: Extending Chargrid with Word-level Information
Joan S. P. F., Valli S., Jan. 2018, A Survey on Text Information Extraction from Born-Digital
and Scene Text Images, Proc. Natl. Acad. Sci. India Sect. Phys. Sci., Vol. 89, No.
1, pp. 77-101
Adnan K., Akbar R., Oct. 2019, An analytical study of information extraction from
unstructured and multidimensional big data, J. Big Data, Vol. 6, No. 1
Ha H. T., Medved’ M., Nevěřilová Z., Horák A., 2018, Recognition of OCR Invoice Metadata
Block Types, in Text, Speech, and Dialogue, Springer International Publishing, pp.
304-312
Zhang J., Ren F., Ni H., Zhang Z., Wang K., Dec. 2019, Research on Information Recognition
of VAT Invoice Based on Computer Vision
Zhi X., Shen Z., Zhao B., Jul. 2021, A Method for Identifying the Key Information
of Electronic Invoicing in Complex Scenes
Kumar P., Revathy S., 2021, An Automated Invoice Handling Method Using OCR, in Data
Intelligence and Cognitive Informatics, Springer Singapore, pp. 243-254
Wang Y., 2022, Intelligent Invoice Identification Technology Based on Zxing Technology,
in Lecture Notes in Electrical Engineering, Springer Nature Singapore, pp. 87-93
Ha H. T., Horák A., Mar. 2022, Information extraction from scanned invoice images
using text analysis and layout features, Signal Process. Image Commun., Vol. 102,
pp. 116601
Hamelers L. H., Jan. 2021, Detecting and explaining potential financial fraud cases
in invoice data with Machine Learning.
Tutica L., Vineel K. S. K., Mishra S., Mishra M. K., Suman S., 2021, Invoice Deduction
Classification Using LGBM Prediction Model, in Lecture Notes in Electrical Engineering,
Springer Singapore, pp. 127-137
Oprea S.-V., Bâra A., Sep. 2021, Machine learning classification algorithms and
anomaly detection in conventional meters and Tunisian electricity consumption large
datasets, Comput. Electr. Eng., Vol. 94, pp. 107329
Tarawneh A. S., Hassanat A. B., Chetverikov D., Lendak I., Verma C., Apr. 2019, Invoice
Classification Using Deep Features and Machine Learning Techniques
Zhang C., Li B., Edirisinghe E., Smith C., Lowe R., 2022, Extract Data Points from
Invoices with Multi-layer Graph Attention Network and Named Entity Recognition, in
2022 IEEE International Conference on Artificial Intelligence and Computer Applications
(ICAICA), pp. 1-6
Li M., 2022, Smart Accounting Platform Based on Visual Invoice Recognition Algorithm,
in 2022 6th International Conference on Computing Methodologies and Communication
(ICCMC), pp. 1436-1439
Ding N., Zhang X., Zhai Y., Li C., Mar. 2021, Risk assessment of VAT invoice crime
levels of companies based on DFPSVM: a case study in China, Risk Manage., Vol. 23,
No. 1-2, pp. 75-96
Riba P., Dutta A., Goldmann L., Fornes A., Ramos O., Llados J., Sep. 2019, Table Detection
in Invoice Documents by Graph Neural Networks
Bardelli C., Rondinelli A., Vecchio R., Figini S., Nov. 2020, Automatic Electronic
Invoice Classification Using Machine Learning Models, Mach. Learn. Knowl. Extr., Vol.
2, No. 4, pp. 617-629
Tang P., et al., Oct. 2020, Anomaly detection in electronic invoice systems based on
machine learning, Inf. Sci., Vol. 535, pp. 172-186
Hong J., Yeo H., Cho N.-W., Ahn T., Oct. 2018, Identification of Core Suppliers Based
on E-Invoice Data Using Supervised Machine Learning, J. Risk Financ. Manag., Vol.
11, No. 4, pp. 70
Baek Y., Lee B., Han D., Yun S., Lee H., Jun. 2019, Character Region Awareness for
Text Detection
Koch G., Zemel R., Salakhutdinov R., 2015, Siamese Neural Networks for One-shot Image
Recognition, in Proceedings of the 32nd International Conference on Machine Learning,
pp. 8
Kipf T. N., Welling M., 2017, Semi-Supervised Classification with Graph Convolutional
Networks.
Kumthekar Y. V., 2020, Using ChebConv and B-Spline GNN models for Solving Unit Commitment
and Economic Dispatch in a day ahead Energy Trading Market based on ERCOT Nodal Model
Kingma D. P., Ba J., 2017, Adam: A Method for Stochastic Optimization
Author
An Cong Tran (tcan@cit.ctu.edu.vn) is a senior lecturer at the College of Information
and Communication Technology, Can Tho University, Vietnam. He earned his Bachelor's
degree in Computer Science at CTU in 2001 and became a lecturer at CTU thereafter.
In 2007, he received a Master's degree (Hons) in Computer Science from the Asian Institute
of Technology (AIT), Thailand. In 2013, he received a Doctoral degree in Computer
Science from Massey University, New Zealand. His PhD thesis focused on a symmetric
parallel approach for class expression learning. His current research interests include
description logic learning, ontology learning, applications of blockchain in the public
sector, and applications of deep learning methods.
Lai Thi Ho is a recent graduate of the College of Information and Communication Technology,
Can Tho University, Can Tho, Vietnam. Her research interests include machine learning,
deep learning, computer vision, and web programming.
Hai Thanh Nguyen is a lecturer at CICT, Can Tho University, Vietnam. He received his
B.S. degree in Informatics from Can Tho University, his Master's degree in Computer
Science and Engineering from National Chiao Tung University, Taiwan, and his PhD degree
in Computer Science from Sorbonne University, France. His current research includes
bioinformatics, health care systems, recommendation systems, and machine learning-based
applications. Contact him at nthai.cit@ctu.edu.vn.