This chapter explores how to build a blockchain-based data security guarantee system
for logistics. First, the Proof of Reputation (PoR) consensus algorithm, which is based
on a reputation mechanism, is studied, and its necessity and advantages in ensuring
the secure transmission of blockchain data are analyzed. Subsequently, a specialized blockchain
data security guarantee system model is constructed based on the characteristics
of the logistics system, and its construction steps and critical technologies are
elaborated. These studies are expected to provide a new basis for data
security in logistics systems and new perspectives for future technological improvements
and system optimization.
3.1 Research on the PoR Consensus Algorithm Based on the Reputation Mechanism
Choosing an appropriate consensus algorithm is crucial for building a blockchain-based
data security guarantee system for logistics. Among the candidates, the PoR consensus algorithm
based on the reputation mechanism has gradually attracted attention because of its
unique advantages. The algorithm makes decisions based on the historical behavior and
reputation of the participants, which can prevent malicious attacks and data tampering
to some extent, thereby improving data security [17,18]. Fig. 1 presents the operation mode of the reputation mechanism based on blockchain.
Fig. 1. Operational model of the reputation mechanism based on blockchain.
The system is designed based on the PoR consensus algorithm (Fig. 1), in which nodes exchange information through network connections. The nodes, identified
by their public keys, are divided into honest and faulty nodes and together constitute a voting
community. Nodes may become faulty because of crashes or missing information, while
malicious nodes can control all faulty nodes and coordinate their behavior. The system
assumes that the network is untrustworthy and unreliable, but that the failure of one node
will not affect the normal operation of other nodes. The $N$ nodes take turns sending
and receiving messages and performing local computations, and each node is identified
by a key pair. An evaluation value is generated at the end of each interaction, in which
a node acts as either the receiving (evaluated) node or the evaluating node. The interaction
tuple is expressed as Eq. (1),
where $pk_{i}$ is the public key of the receiving node, $r$ is the rating given by
the evaluating node, and $D_{sk_{i}}$ is the encrypted transaction data signed with the evaluated
node's key. In the consensus stage, the transactions of round $t$ are added to the pending
list. Before a consensus round starts, more than 50% of the nodes, those with the highest
reputation values, are selected to form a consensus group $G_{t}$. The member nodes
of the consensus group are expressed as Eq. (2),
where $G_{t}$ is the consensus group. A new leader node $L_{t}$ is selected for round
$t$; it packages the valid transactions, calculates the new reputation value $Reputation$,
and finally broadcasts a submission message to the consensus group. The selection
of the leader node is expressed as Eq. (3),
where $Block_{t}$ represents the block and $pk_{L}$ is the public key of the leader
node. Each node checks the submission information sent by the leader node, including
$Block_{t}$ and its hash value, and sends a new submission message after verifying
the public key match, hash integrity, and transaction validity. Block publication
after the verified submission can be written as Eq. (4),
where each node $pk_{i}$ that successfully verifies $Block_{t}$ and $ReputationList$
sends the verification information back to $G_{t}$. The reputation of a node is
based on its historical ratings and reflects the trust that other nodes place in it. Generally,
reputation systems rely on the feedback evaluations of nodes, taking into account
their interaction satisfaction with other nodes in the network. To this end, three
applicable reputation principles are proposed. First, the liquidity of reputation
values: the calculation of a reputation value is based on the reputation values
of the nodes providing the ratings. Second, the time range of reputation values:
the contribution of past reputation values to the current reputation value is relatively
small. Third, the openness of reputation values: all community members can
view the reputation values of all nodes. The initial evaluation of node behavior is
expressed as Eq. (5),
where $S_{i}$ represents the reputation of receiving node $i$. Default reputation
values are assigned to all nodes as predictive indicators during system initialization.
During each voting round, a node can receive ratings from multiple other nodes. The
normalization of $S_{i}$ is expressed as Eq. (6),
where $\left[S_{ij}\right]$ is the defined rating matrix. After each round, these
assessments are generated for all nodes in the network. The new reputation values
of the nodes in round $t$ are calculated by mixing these ratings with the reputation
values of the evaluating nodes in round $t-1$. The mixed reputation value of a node
is written as Eq. (7),
where $\overset{\rightarrow }{S}=\left[S_{ij}\right]$ and $\overset{\rightarrow }{R}=\left[R_{in}\right]$.
The evaluating nodes provide the evaluation levels, and each node mixes the previous
reputation value with the current ranking generated from the ratings. The
reputation value of a node in round $t$ is expressed as Eq. (8),
where $\alpha$ is a constant between 0 and 1 determined at system initialization;
it decides which part of the formula has higher priority, and $P$ is the evaluation.
If $\alpha$ is close to 1, the newly generated reputation value gives higher priority
to the rating $P$ and lower priority to the previously generated reputation value,
which conforms to the second reputation principle. This helps reduce
the long-term impact of a node's past behavior as its behavior changes over time [19]. The sigmoid function in Eq. (9) constrains these values and prevents large fluctuations in the reputation values.
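To make the above update concrete, the following minimal Python sketch illustrates one plausible reading of Eqs. (6)–(9): the per-round rating matrix is row-normalized, mixed with the evaluating nodes' previous reputations, blended with the old reputation through $\alpha$, and finally constrained by the logistic sigmoid. The function and variable names, and the exact normalization and mixing rules, are illustrative assumptions rather than the system's actual formulas.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid used to constrain reputation values (cf. Eq. (9)).
    return 1.0 / (1.0 + np.exp(-x))

def update_reputation(S, R_prev, alpha=0.7):
    """One plausible reading of the reputation update described by Eqs. (6)-(8).

    S      : (n, n) rating matrix; S[i, j] is the rating node j gave node i this round
    R_prev : (n,) reputation values from round t-1
    alpha  : constant in (0, 1); values close to 1 favor the new ratings
    """
    S = np.asarray(S, dtype=float)
    R_prev = np.asarray(R_prev, dtype=float)

    # Assumed Eq. (6): normalize each node's received ratings so they sum to 1.
    row_sums = S.sum(axis=1, keepdims=True)
    S_norm = np.divide(S, row_sums, out=np.zeros_like(S), where=row_sums != 0)

    # Assumed Eq. (7): weight the normalized ratings by the evaluating nodes'
    # previous reputations, so ratings from well-reputed nodes count for more.
    P = S_norm @ R_prev

    # Assumed Eq. (8): blend the new rating score P with the previous reputation.
    R_new = alpha * P + (1.0 - alpha) * R_prev

    # Eq. (9): constrain the result to suppress large fluctuations.
    return sigmoid(R_new)
```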
Fig. 2 shows the flowchart of the PoR consensus algorithm in the blockchain-based
logistics system.
A compliant logistics node is selected as the proposer responsible for generating
new transaction data blocks (Fig. 2). The proposer verifies the integrity and consistency of the logistics data to be packaged,
and the data that pass verification are packaged into a new data block together with its hash value. The proposer
then sends the new data block to the other nodes for signature and rights verification, which confirms
that the block is legal and that the proposer holds sufficient rights. Only a proposer that passes this verification
by the other nodes can have its new data block confirmed and added to the blockchain.
This process ensures that only logistics nodes with the required rights can participate
in consensus.
Fig. 2. Flowchart of the PoR consensus algorithm in a blockchain-based logistics system.
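To relate the flow in Fig. 2 to the consensus steps above, the sketch below walks through one simplified round: the highest-reputation nodes form the consensus group, an elected leader packages the pending transactions into a block, and the block is appended only after a majority of the group verifies its hash. The `Block` structure, the top-reputation leader rule, and the verification check are simplifying assumptions, not the platform's implementation.

```python
import hashlib
import json
from dataclasses import dataclass

@dataclass
class Block:
    round_no: int
    transactions: list
    prev_hash: str
    block_hash: str = ""

def compute_hash(block: Block) -> str:
    payload = json.dumps([block.round_no, block.transactions, block.prev_hash])
    return hashlib.sha256(payload.encode()).hexdigest()

def select_consensus_group(reputations: dict) -> list:
    # Form G_t from the nodes with the highest reputation values
    # ("more than 50% of the nodes", per the description above).
    ranked = sorted(reputations, key=reputations.get, reverse=True)
    return ranked[: len(ranked) // 2 + 1]

def run_consensus_round(round_no, pending_txs, prev_hash, reputations, chain):
    group = select_consensus_group(reputations)
    leader = group[0]  # assumed rule: leader L_t is the top-reputation group member

    # The leader packages the valid transactions into a new block with its hash.
    block = Block(round_no, list(pending_txs), prev_hash)
    block.block_hash = compute_hash(block)

    # Group members verify the submission (here only hash integrity is checked).
    approvals = [node for node in group if compute_hash(block) == block.block_hash]

    if len(approvals) > len(group) // 2:  # a majority of G_t confirms the block
        chain.append(block)               # the block is published and appended
    return leader, block

# Example: run_consensus_round(1, ["waybill#001"], "0" * 64,
#                              {"nodeA": 0.9, "nodeB": 0.6, "nodeC": 0.4}, chain=[])
```

In the full system, the verification step would additionally check the leader's signature against its public key $pk_{L}$ and the proposer's rights, as described above.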
3.2 Building a Blockchain-based Data Security System Architecture for Logistics
After exploring the PoR consensus algorithm based on the reputation mechanism, the
next focus is building the architecture of the data security guarantee system for
logistics. The system model draws on the core idea of the PoR consensus algorithm,
especially the reputation mechanism, to protect logistics data. In logistics systems,
the integrity, timeliness, and tamper resistance of data are crucial; therefore,
achieving efficient data processing and transmission while ensuring data security
is the main challenge of this research. This study uses principal component analysis
to address this challenge. The research aims to improve the security and transparency
of logistics system data through blockchain technology, focusing on the important role
of principal component analysis in optimizing data processing and on its mathematical
foundation. The application of blockchain in the architecture design of the logistics
system is described in detail, including the data storage mechanism and the consensus
process between nodes, to ensure the immutability and authenticity of information.
By establishing a functional hierarchy model of the system, this chapter describes
the responsibilities, data flow paths, and blockchain operation mechanisms of the
various entities within the logistics system, providing a structured and reliable
basis for the data security assurance solution.
Principal component analysis (PCA) aims to achieve the effect of "showing more
with less" through dimensionality reduction [20]. Dimensionality reduction is a technique that reduces the number of variables in
a dataset while preserving most of the information in the original dataset. By applying
PCA, a small number of principal components can be extracted from a large number of
variables, and these components reflect the most important information and structure of the source
dataset. The mathematical expression of PCA can be written
as Eq. (10), which defines a series of linear combinations of the raw data variables weighted by the eigenvectors
corresponding to the eigenvalues of the data covariance matrix,
where $m$ is the number of samples; $n$ is the number of evaluation indicators; $a_{i1}, a_{i2},
a_{i3}, \cdots, a_{ip}$ ($i = 1, 2, \cdots, m$) are the eigenvectors corresponding
to the eigenvalues of the covariance matrix of $X$; $X_{n}$ is the standardized data;
and $y_{m}$ is the resulting principal component. The raw data are standardized using
the extreme value standardization method of Eq. (11),
where $j$ is a natural number from 1 to $n$. The matrix of correlation
coefficients is calculated as expressed in Eq. (12),
where $r_{ij}$ is the correlation coefficient between variables $x_{i}$ and $x_{j}$,
and $\overline{x}_{i}$ and $\overline{x}_{j}$ are the means of $x_{i}$ and $x_{j}$. The
comprehensive weight of each indicator is then calculated as expressed in Eq. (13),
where $w'_{j}$ is the principal component analysis weight, $w''_{j}$ is the entropy weight
method weight, and $w_{j}$ is the comprehensive weight of each indicator. The power function
of the order parameters of each subsystem is used to measure the influence of each
order parameter on the system and its contribution to the system. The degree of order
of the subsystem order parameters is expressed as Eq. (14),
where $w_{j}$ is the weight of the order parameters of each subsystem and $y_{i}\left(c_{i}\right)$
is the degree of order of each subsystem.

The blockchain-based logistics data security
platform mainly meets the needs of classifying and storing truck and user data on
and off the chain, as well as ensuring the consistency of waybill information. For
the secure storage of logistics information, an "on chain + off chain" approach is adopted
to improve storage efficiency while ensuring security. The complete logistics information is stored
off chain: structured information is stored through IPFS, while important logistics information,
such as waybill information, is stored on the chain. The hash values calculated
from the off-chain data are encrypted and uploaded to the blockchain to achieve
tamper prevention and verification, ensuring data integrity and reliability. The management
module is responsible for user management and query services, while the consensus
module is responsible for block generation, information storage verification, and
inter-node consensus, guaranteeing data accuracy and consistency. Regarding actual
business scenarios, the platform involves trucks, drivers, cargo owners, management,
and cargo sources.
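The indicator weighting described by Eqs. (11)–(13) above can be sketched as follows, assuming standard textbook forms for each step: extreme value (min-max) standardization, PCA weights derived from the correlation matrix, entropy weights, and a product-and-normalize rule for the comprehensive weight. It is a minimal illustration under these assumptions, not the platform's implementation.

```python
import numpy as np

def minmax_standardize(X):
    # Extreme value (min-max) standardization of each indicator column (cf. Eq. (11)).
    mins, maxs = X.min(axis=0), X.max(axis=0)
    return (X - mins) / np.where(maxs - mins == 0, 1, maxs - mins)

def pca_weights(X_std):
    # PCA weights from the correlation matrix of the standardized indicators
    # (cf. Eq. (12)): each indicator's loading on the first principal component,
    # normalized to sum to 1. Assumes no indicator is constant.
    corr = np.corrcoef(X_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    first_pc = np.abs(eigvecs[:, np.argmax(eigvals)])
    return first_pc / first_pc.sum()

def entropy_weights(X_std):
    # Entropy weight method: indicators with more dispersion receive larger weights.
    p = X_std / (X_std.sum(axis=0) + 1e-12)
    e = -np.nansum(p * np.log(p + 1e-12), axis=0) / np.log(len(X_std))
    d = 1 - e
    return d / d.sum()

def comprehensive_weights(X):
    # Combine the two weight vectors into the comprehensive weight (cf. Eq. (13));
    # a product-and-normalize combination rule is assumed here.
    X_std = minmax_standardize(np.asarray(X, dtype=float))
    w1, w2 = pca_weights(X_std), entropy_weights(X_std)
    w = w1 * w2
    return w / w.sum()
```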
Fig. 3. System scenario diagram.
Fig. 3 presents a schematic diagram of the system scenario, showing the functions and collaborative
working methods of the key entities in the logistics blockchain platform. Trucks are
the logistics carriers in the platform, the transport vehicles responsible for
cargo handling tasks. As registered users, drivers are responsible for operating the trucks
and completing the transportation orders the platform assigns. The shipper, also a registered
user, is responsible for publishing transportation requirements to the platform
and tracking the transportation status of goods in real time. Administrators have
the highest authority to supervise and manage all entities in the system, ensuring
the orderly operation of the platform. The source of goods refers to the facilities
that cooperate with the platform to provide warehousing services for goods; its location
is an indispensable geographical element in logistics strategic planning, determining
the route and cost of goods transportation. Hyperledger Fabric is chosen as the underlying
blockchain platform: its open source code, flexible network architecture, and extensive community
support ensure the customizability and fast iteration capability of the system, and its
efficiency, security, programmability, and scalability meet the needs of logistics data
security and efficiency. Fig. 4 shows the architecture of the data security guarantee system for logistics.
Fig. 4. Architecture of the data security guarantee system for logistics.
The architecture is divided into five layers: the data collection layer, data layer, consensus
and network layer, presentation layer, and user layer. The data collection layer collects
and transmits data through RF devices, information collection terminals, and applied
sensors to improve the efficiency of the logistics information platform. The data
layer determines the storage form of the logistics information data, with processed information
abstracts stored in the blockchain network and complete data and hash values stored
in a relational database. The consensus and network layer includes the P2P network, authentication
mechanism, propagation mechanism, and PoR consensus algorithm. The consensus service
framework of Hyperledger Fabric supports multiple consensus algorithms and consensus
strategies with high flexibility and configurability, which improves the flexibility and
applicability of the consensus algorithm. The presentation layer utilizes the B/S architecture
and JSP technology to present data. The user layer includes the driver, freight owner,
and freight source. The driver and freight source are responsible for information
entry, and the freight owner and management can query the logistics information of
the goods. Fig. 5 presents the operational mechanism of the blockchain-based data security guarantee
system for a logistics platform.
Fig. 5. Operating mechanism of the blockchain-based data security guarantee system for a logistics platform.
A user registers as a system node to use the platform's functions. The system deploys
smart contracts for logistics information traceability and supervision on the blockchain
to ensure effective supervision of the data. Data are classified, stored
in a distributed system, and synchronized to the blockchain to ensure consistency.
Management personnel, regularly or according to their permissions, trigger data comparisons,
analyze the query results, and submit processing requests if any abnormalities are found.
According to the user's identity, the system calls the corresponding smart contract to return
the permitted range of query information. The system aims to provide users with effective logistics
information traceability and regulatory services, ensuring data security and accuracy.
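The classified "on chain + off chain" storage and the data comparison described above can be sketched as follows: the complete record is kept off chain, only its digest is anchored on chain, and a later comparison recomputes the digest to detect tampering. The dictionaries stand in for the platform's IPFS and relational database back ends, and the digest encryption step mentioned in Section 3.2 is omitted for brevity.

```python
import hashlib
import json

off_chain_store = {}   # stand-in for the IPFS / relational database storage
on_chain_ledger = {}   # stand-in for the digests recorded on the blockchain

def store_waybill(waybill_id: str, record: dict) -> str:
    # Keep the complete logistics record off chain.
    payload = json.dumps(record, sort_keys=True).encode()
    off_chain_store[waybill_id] = payload
    # Anchor only the record's hash on chain for tamper-proofing.
    digest = hashlib.sha256(payload).hexdigest()
    on_chain_ledger[waybill_id] = digest
    return digest

def verify_waybill(waybill_id: str) -> bool:
    # Data comparison: recompute the off-chain record's hash and check it
    # against the digest stored on chain; a mismatch signals tampering.
    payload = off_chain_store.get(waybill_id)
    if payload is None or waybill_id not in on_chain_ledger:
        return False
    return hashlib.sha256(payload).hexdigest() == on_chain_ledger[waybill_id]

# Example: store a waybill, then detect an off-chain modification.
store_waybill("WB-001", {"cargo": "electronics", "driver": "D-17", "status": "in transit"})
assert verify_waybill("WB-001")
off_chain_store["WB-001"] = b'{"cargo": "electronics", "status": "delivered"}'
assert not verify_waybill("WB-001")
```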
Fig. 6 shows the synchronization process of the regional module.
Fig. 6. Synchronization process of regional modules.
The consensus module uses the reputation-based consensus algorithm for its operations,
including establishing consensus groups, electing the leader node, and publishing blocks,
to ensure the storage consistency of logistics information. The administrator module
of the system is responsible for user management and order processing. After a new user
joins, its authenticity is confirmed through the authentication mechanism, and
historical data are synchronized from adjacent nodes.