Oblivious Dynamic Searchable Encryption on Distributed Cloud Systems

Dynamic Searchable Symmetric Encryption (DSSE) allows search/update operations over encrypted data via an encrypted index. However, DSSE has been shown to be vulnerable to statistical inference attacks, which can extract a significant amount of information from access patterns on the encrypted index and files. While generic Oblivious Random Access Machine (ORAM) can hide access patterns, it has been shown to be too costly to be used directly in the DSSE setting. By exploiting the distributed cloud infrastructure, we develop a series of Oblivious Distributed DSSE schemes called ODSE, which enable oblivious access on the encrypted index with high security and improved efficiency over the use of generic ORAM. Specifically, ODSE schemes are 3×-57× faster than applying the state-of-the-art generic ORAMs on an encrypted dictionary index in real network settings. One of the proposed ODSE schemes offers desirable security guarantees such as information-theoretic security with robustness against malicious servers. These properties are achieved by exploiting some of the unique characteristics of searchable encryption and the encrypted index, which permit us to harness the computation and communication efficiency of multi-server PIR and Write-Only ORAM simultaneously. We fully implemented ODSE and conducted extensive experiments to assess the performance of our proposed schemes in a real cloud environment.


Introduction
Data outsourcing allows a client to store their data on the cloud to reduce data management and maintenance costs. Despite its merits, cloud services come with severe privacy issues. The client may encrypt their data with standard encryption to protect their privacy. However, these techniques also prevent the client from performing basic operations (e.g., search/update) over the outsourced encrypted data. This significantly degrades the benefits of cloud services. In the following, we first outline the current state-of-the-art techniques and their limitations, and then present our methods for addressing these challenges.

State-of-the-Art and Limitations
Information Leakage in DSSE. The concept of searchable symmetric encryption (SSE) was first proposed by Song et al. [24]. This construction can only search on static encrypted data. Curtmola et al. [11] introduced single-keyword-searched SSE with a formal security definition, followed by refinements with extended capabilities such as ranked query [27], multi-keyword search [26] or their combinations [7]. Dynamic Searchable Symmetric Encryption (DSSE) was introduced by Kamara et al. [17], which offers both search and update on encrypted files F via an encrypted index I representing keyword-file relationships. Many DSSE schemes have been proposed, each offering various performance, functionality and security trade-offs [4] (e.g., [6,9,17,20,29,31]).
It is known that all standard DSSE schemes leak significant information, which makes them vulnerable to statistical inference analysis [8,16,18,30]. There are two sources of information leakage in DSSE: (i) leakages through search and update on the encrypted index I, and (ii) leakages due to access of the encrypted files F. Specifically, since the search and update tokens are deterministic, all DSSE schemes leak access patterns on both I and F. Furthermore, most of them also leak the content of updated files during the update (i.e., they lack forward privacy) and historical updates (add/delete) on the keyword during the search on I (i.e., they lack backward privacy). By exploiting these leakages, recent studies have shown that sensitive information about encrypted queries and files can be recovered [8,18]. Zhang et al. [30] presented file-injection attacks that can determine which keywords have been searched, especially in forward-insecure DSSE schemes. Although some DSSE schemes with improved security (e.g., forward and backward privacy) have been proposed (e.g., [6]), they rely on extremely costly public key operations and still leak access patterns. Liu et al. [18] demonstrated an attack that can determine which keywords have been searched by observing the frequency of search queries (search patterns). Zhang et al. [30] indicated that future research on DSSE should focus on sealing information leakages rather than accepting them by default. Unless these leakages are prevented, a trustworthy deployment of DSSE for privacy-critical applications may remain extremely difficult.
Performance Hurdles of the Existing Approaches to Reduce Information Leakages in DSSE. Several attempts (e.g., [5,15]) are either highly costly or unable to completely seal all leakages in DSSE access patterns. Generic Oblivious Random Access Machine (ORAM) [25] can hide access patterns, and therefore, it can prevent most of the information leakages in DSSE. Garg et al. [12] proposed the TWORAM scheme, which optimizes the round-trip communication under O(1) client storage when using ORAM to hide file access patterns in DSSE. Despite its merits, prior studies (e.g., [9,21]) stated that generic ORAM (e.g., [25]) is still too costly to be used in DSSE due to its logarithmic communication overhead. Although several ORAMs with O(1) bandwidth complexity have been introduced recently, they are still very costly due to the use of Homomorphic Encryption (HE). The performance of such schemes has been shown to be worse than O(log N)-bandwidth ORAMs [2].

Our Research Objective and Contributions
It is imperative to seal information leakages from accessing encrypted files F and encrypted index I. Since the size of individual files in F might be arbitrarily large and each search/update query might involve a different number of files, to the best of our knowledge, generic ORAM seems to be the only option for oblivious access on F. The objective of this paper is to design oblivious access techniques on I, which are more efficient than using generic ORAM, by exploiting special properties of searchable encryption and I as elaborated in Fig. 1. In particular, we identify a suitable data structure for I that allows search and update to operate on separate dimensions. This property permits us to harness communication-efficient techniques such as Write-Only ORAM for update and, by exploiting distributed cloud infrastructure, multi-server PIR for search with low computation overhead. Note that low communication and computation are important factors in practice since they directly translate into low end-to-end delay and, consequently, improve the quality of service of cloud systems. The price to pay for such low delay is the collusion vulnerability in the distributed setting, where we assume that only a limited number of servers can collude with each other, which is the common adversarial model of multi-server PIR techniques (see Sects. 2 and 4).
We propose a series of Oblivious Distributed Encrypted Index I on the distributed cloud infrastructure with the application on DSSE, which we refer to as ODSE (Fig. 1). We present two ODSE schemes called ODSE^wo_xor and ODSE^wo_it, each offering various desirable performance and security properties as follows.
• Low end-to-end delay: ODSE schemes achieve low end-to-end delay and are 3×-57× faster than the use of efficient generic ORAMs (e.g., [22,25]) (with optimization [12]) on the encrypted index under real network settings (see Sect. 5). Standard DSSE constructions (e.g., [9]) are much faster, but also less secure than our proposed methods in the sense that they leak more information beyond the access patterns (e.g., forward-privacy, backward-privacy) over the encrypted index. Compared with standard DSSE where access patterns are leaked by default, ODSE schemes offer higher security by sealing all these leakages at the cost of higher latency. Nevertheless, they are more efficient than using generic ORAM techniques atop the DSSE encrypted index to seal such leakages in certain cases regarding database and query sizes. We provide the detailed analysis in Sect. 5.

Preliminaries and Building Blocks
Notation. We denote F_p as a finite field where p is a prime. Operators || and ⊕ denote concatenation and XOR, respectively. (·)_bin denotes the binary representation. u · v denotes the inner product of two vectors u and v. x ←$ S denotes that x is randomly and uniformly selected from set S. Given I as a row/column of a matrix, I[i] denotes accessing the i-th component of I. Given a matrix I, I[*, j...j′] denotes accessing columns j to j′ of I. Let E = (Enc, Dec, Gen) be an IND-CPA secure symmetric encryption scheme: κ ← E.Gen(1^θ) generates a key with security parameter θ; C ← E.Enc_κ(M) encrypts plaintext M with key κ; M ← E.Dec_κ(C) decrypts ciphertext C with key κ.
Shamir Secret Sharing (SSS). We present the (t, ℓ)-threshold Shamir Secret Sharing (SSS) scheme [23] in Fig. 2. Given a secret α ∈ F_p, the dealer generates a random t-degree polynomial f and evaluates f(x_l) for each party P_l ∈ {P_1, ..., P_ℓ}, where x_l ∈ F_p \ {0} is the deterministic identifier of P_l. We denote the share for P_l as [[α]]_l. The secret can be reconstructed by combining at least t + 1 correct shares via Lagrange interpolation. Note that the secret can be recovered even in the presence of a number of incorrect shares by error correction techniques (discussed in Sect. 4). We use this property to improve the robustness of our protocol in malicious settings.
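As an illustration of the (t, ℓ)-threshold scheme described above, the following is a minimal Python sketch, not the authors' implementation. The prime P = 65521 and the function names sss_create and sss_recover are assumptions made for this example only.

```python
import secrets

P = 65521  # a 16-bit prime, chosen here purely for illustration


def sss_create(alpha, t, xs, p=P):
    """Share secret alpha among parties with identifiers xs (all nonzero mod p).

    A random polynomial f of degree at most t with f(0) = alpha is sampled,
    and the share of party x_l is f(x_l).
    """
    coeffs = [alpha % p] + [secrets.randbelow(p) for _ in range(t)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation of f at x
            acc = (acc * x + c) % p
        return acc

    return {x: f(x) for x in xs}


def sss_recover(shares, p=P):
    """Recover f(0) by Lagrange interpolation from a dict {x_l: share}.

    At least t + 1 correct shares are needed for a degree-t polynomial.
    """
    secret = 0
    for xl, yl in shares.items():
        num, den = 1, 1
        for xm in shares:
            if xm != xl:
                num = (num * (-xm)) % p
                den = (den * (xl - xm)) % p
        secret = (secret + yl * num * pow(den, -1, p)) % p
    return secret
```

With t = 1 and three parties, any two shares suffice to recover the secret, and adding two share vectors component-wise yields shares of the sum, which is the additive homomorphism used later for PIR.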
SSS is t-private so that any combination of t shares leaks no information about the secret. SSS offers homomorphic properties including addition, scalar multiplication, and partial multiplication. We extend the notion of share of value to indicate the share of vector: given a vector v = (v_1, ..., v_n), [[v]]_l = ([[v_1]]_l, ..., [[v_n]]_l) denotes the share of v for party P_l.
Private Information Retrieval (PIR). PIR enables private retrieval of a data item from an (unencrypted) public database server. We recall two efficient multi-server PIR protocols: (i) XOR-based PIR [10] (Fig. 3), which uses XOR operations and requires each server S_l to store b_l, a replica of database b containing m blocks (b_1, ..., b_m) of the same size; (ii) SSS-based PIR [13] (Fig. 4), which relies on homomorphic properties of SSS, where each server stores b_l, a replica of the database b containing m blocks (b_1, ..., b_m), where b_i ∈ F_p.
Write-Only ORAM. ORAM allows the user to hide the access patterns when accessing their encrypted data on the cloud. In contrast to generic ORAM where both read and write operations are hidden, Blass et al. [3] proposed a Write-Only ORAM scheme, which only hides the write pattern in the context of hidden volume encryption. Intuitively, 2n memory slots are used to store n blocks, each assigned to a distinct slot, and a position map is maintained to keep track of each block's location. Given a block to be rewritten, the client reads λ slots chosen uniformly at random and writes the block to a dummy slot among the λ slots. Data in all slots are encrypted to hide which slot is updated. By selecting λ sufficiently large (e.g., 80), one can achieve a negligible failure probability, where a failure occurs when all λ slots are non-dummy. It is also possible to select a small λ (e.g., 4). In this case, the client maintains a stash component S of size O(log n) to temporarily store blocks that cannot be rewritten when all read slots are full.
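The Write-Only ORAM write procedure with a stash can be sketched as follows. This is a simplified model under stated assumptions: encryption and re-encryption of the touched slots are omitted, a free slot is modelled as None, and the class name WriteOnlyORAM and its methods are hypothetical. Unlike the description above, the sketch flushes as many stash blocks as fit into the free slots among the λ chosen ones.

```python
import secrets


class WriteOnlyORAM:
    """Toy model of Write-Only ORAM: n blocks stored in 2n slots (no encryption)."""

    def __init__(self, n, lam=4):
        self.n, self.lam = n, lam
        self.slots = [None] * (2 * n)  # None marks a dummy (free) slot
        self.pos = {}                  # position map: block id -> slot index
        self.stash = {}                # blocks waiting for a free slot
        self._rng = secrets.SystemRandom()

    def write(self, block_id, data):
        # The new version first enters the stash; the old copy becomes dummy.
        self.stash[block_id] = data
        old = self.pos.pop(block_id, None)
        if old is not None:
            self.slots[old] = None
        # Touch lam distinct slots chosen uniformly at random.
        chosen = self._rng.sample(range(2 * self.n), self.lam)
        for s in chosen:
            if self.slots[s] is None and self.stash:
                bid, d = self.stash.popitem()
                self.slots[s] = (bid, d)
                self.pos[bid] = s
            # A real scheme re-encrypts and writes back every chosen slot,
            # so the server cannot tell which of them actually changed.
        return chosen

    def read(self, block_id):
        # Reads are not hidden in Write-Only ORAM.
        if block_id in self.stash:
            return self.stash[block_id]
        return self.slots[self.pos[block_id]][1]
```

The set of touched slots is independent of the block being written, which is the source of write obliviousness once the slot contents are encrypted.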

The Proposed ODSE Schemes
Intuition. In DSSE, keyword search and file update on I are read-only and write-only operations, respectively. This property permits us to leverage specific bandwidth-efficient oblivious access techniques for each operation, such as multi-server PIR (for search) and Write-Only ORAM (for update), rather than using generic ORAM. The second requirement is to identify an appropriate data structure for I so that the above techniques can be adapted. We found that the forward index and the inverted index are the ideal choices for the file update and keyword search operations, respectively, as proposed in [14]. However, doing search and update on two isolated indexes can cause an inconsistency, which requires the server to perform synchronization. The synchronization operation leaks significant information [14]. To avoid this problem, it is necessary to integrate both the search index and the update index in an efficient manner. Fortunately, this can be achieved by leveraging a two-dimensional index (i.e., a matrix), which allows keyword search and file update to be performed in two separate dimensions without creating any inconsistency at their intersection. This strategy permits us to perform computation-efficient (multi-server) PIR on one dimension, and communication-efficient (Write-Only) ORAM on the other dimension to achieve oblivious search and update, respectively, with high efficiency.

ODSE Models and Data Structures
System Model. Our model comprises a client and ℓ servers S = (S_1, ..., S_ℓ), each storing a version of the encrypted index. In our system, the encrypted files are stored on S′, a separate server different from S (as in [15]), which can be obliviously accessed via a generic ORAM (e.g., [25]). In this paper, we only focus on oblivious access of the encrypted index on S.
Threat Model. In our system, the client is trusted and the servers S are untrusted. We consider the servers to be semi-honest, meaning that they follow the protocol faithfully, but can record the protocol transcripts to learn information regarding the client's access pattern. However, our system can be easily extended to deal with malicious servers that attempt to tamper with the input data to compromise the correctness and the security of the system (see Sect. 4). We allow up to t < ℓ (privacy parameter) servers among S to be colluding, meaning that they can share their own recorded protocol transcripts with each other. We present the formal security model in Sect. 4.
Data Structures. Assuming that the outsourced database can store up to N distinct files and M unique keywords, our index is an incidence matrix I, where each cell I[i, j] ∈ {0, 1} represents the relationship between the keyword at row i and the file at column j. Each keyword and file is assigned to a unique row and column index, respectively. Each row of I represents the search result of a keyword, while the content (unique keywords) of a file is represented by a column. Since we use Write-Only ORAM for file update, the number of columns in I is doubled and a stash S is used to store columns of I during the update. Therefore, the size of the search index I is M × 2N.
We leverage two static hash tables T_w, T_f as in [28] to keep track of the location of keywords and files in I, respectively. They have the following structure: T := ⟨key, value⟩, where key is a keyword or file ID and value ← T[key] is the (row/column) index of key in I. Since there are 2N columns in I but only N files, we denote D as the set of dummy columns that are not assigned to a particular file.
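To make the data structures concrete, the following Python sketch models the plaintext incidence matrix together with T_w, T_f and the dummy set D. It is only an illustration: encryption, PIR and Write-Only ORAM are all omitted, a file keeps its column across updates (whereas with Write-Only ORAM the column would move), and the class name MatrixIndex and its methods are hypothetical.

```python
class MatrixIndex:
    """Plaintext M x 2N incidence matrix: rows are keywords, columns are files."""

    def __init__(self, M, N):
        self.M, self.N = M, N
        self.I = [[0] * (2 * N) for _ in range(M)]
        self.T_w = {}                # keyword -> row index
        self.T_f = {}                # file id -> column index
        self.D = set(range(2 * N))   # dummy columns not assigned to any file

    def _row(self, w):
        if w not in self.T_w:
            if len(self.T_w) >= self.M:
                raise ValueError("keyword capacity M exceeded")
            self.T_w[w] = len(self.T_w)
        return self.T_w[w]

    def update(self, fid, keywords):
        """File update: rewrite one whole column (the write-only dimension)."""
        if fid not in self.T_f:
            if len(self.T_f) >= self.N:
                raise ValueError("file capacity N exceeded")
            col = min(self.D)        # deterministic choice, for the example only
            self.D.remove(col)
            self.T_f[fid] = col
        col = self.T_f[fid]
        rows = {self._row(w) for w in keywords}
        for i in range(self.M):
            self.I[i][col] = 1 if i in rows else 0

    def search(self, w):
        """Keyword search: read one whole row (the read-only dimension)."""
        if w not in self.T_w:
            return []
        row = self.I[self.T_w[w]]
        # Only columns assigned to real files are inspected; dummies are skipped.
        return sorted(fid for fid, j in self.T_f.items() if row[j] == 1)
```

Search touches a single row and update touches a single column, so the two operations never need a separate synchronization step.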

ODSE^wo_xor: Fast ODSE
We introduce ODSE^wo_xor, which harnesses XOR-based PIR and Write-Only ORAM to achieve low search and update latency.
Search. Intuitively, to search for a keyword w, the client and servers execute the XOR-based PIR protocol on the row dimension of I to privately retrieve the row data of w. Since the row is encrypted rather than being public as in the traditional PIR model, the client decrypts the retrieved data and filters out dummy column indexes to obtain the search result. The detail is as follows.
Although ODSE^wo_xor offers highly efficient search and update operations, it has the following security limitations: (i) it can (at most) detect, but cannot recover from, malicious servers, which might tamper with the data to compromise the privacy and correctness of the protocol. In privacy-critical applications, it is desirable to recover from malicious servers to improve the robustness of the protocol; (ii) the encrypted index and update operations on it are only computationally secure due to the IND-CPA encryption.
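For reference, the XOR-based PIR retrieval that underlies the search in ODSE^wo_xor can be sketched in Python as follows. This is a minimal sketch of the classic multi-server construction over replicated databases, not the authors' code: the row decryption and dummy-column filtering performed by the client are omitted, and all function names are assumptions.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def pir_queries(i, m, num_servers=2):
    """Build one binary selection vector per server whose XOR is the unit vector e_i."""
    queries = [[secrets.randbelow(2) for _ in range(m)]
               for _ in range(num_servers - 1)]
    last = [0] * m
    last[i] = 1
    for q in queries:
        last = [a ^ b for a, b in zip(last, q)]
    return queries + [last]


def pir_answer(db, query):
    """Server side: XOR together all blocks selected by the query bits."""
    acc = bytes(len(db[0]))
    for block, bit in zip(db, query):
        if bit:
            acc = xor_bytes(acc, block)
    return acc


def pir_retrieve(db_replicas, i):
    """Client side: query every replica and XOR the answers to obtain block i."""
    m = len(db_replicas[0])
    queries = pir_queries(i, m, len(db_replicas))
    answers = [pir_answer(db, q) for db, q in zip(db_replicas, queries)]
    out = answers[0]
    for a in answers[1:]:
        out = xor_bytes(out, a)
    return out
```

Each individual query is a uniformly random bit string, so a single server learns nothing about i; privacy is lost only if all queried servers collude.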
To address the limitations of ODSE^wo_xor, we introduce ODSE^wo_it, which offers (i) improved robustness against malicious servers with a partial recovery capability, and (ii) the highest level of security (i.e., information-theoretic) for both I and operations on it. The main idea is to share the index with SSS, and harness SSS-based PIR to conduct private search. The robustness comes from the ability to recover the secret shared by SSS in the presence of incorrect shares (see Sect. 4).
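The idea of combining a secret-shared index with SSS-based PIR can be illustrated with the Python sketch below. It rests on assumptions that are not taken from the text: the database is a flat vector of F_p elements, both the database and the select vector e_i are shared with degree-t polynomials, and therefore each server's inner product lies on a polynomial of degree at most 2t, so that 2t + 1 responses are interpolated. The prime P and all function names are hypothetical.

```python
import secrets

P = 65521  # illustrative 16-bit prime


def share(value, t, xs, p=P):
    """Degree-t Shamir shares of value, one per identifier in xs."""
    coeffs = [value % p] + [secrets.randbelow(p) for _ in range(t)]
    return [sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p for x in xs]


def recover(xs, ys, p=P):
    """Lagrange interpolation at 0 from the points (xs[l], ys[l])."""
    total = 0
    for l, (xl, yl) in enumerate(zip(xs, ys)):
        num = den = 1
        for m, xm in enumerate(xs):
            if m != l:
                num = num * (-xm) % p
                den = den * (xl - xm) % p
        total = (total + yl * num * pow(den, -1, p)) % p
    return total


def share_vector(vec, t, xs):
    """Share every component; returns one share vector per server."""
    per_component = [share(v, t, xs) for v in vec]
    return [[comp[l] for comp in per_component] for l in range(len(xs))]


def server_answer(db_share, query_share, p=P):
    """Server side: inner product of its database share and its query share."""
    return sum(a * b for a, b in zip(db_share, query_share)) % p


def pir_retrieve(db_shares, i, t, xs):
    """Client side: share e_i, collect inner products, interpolate at 0."""
    assert len(xs) >= 2 * t + 1, "need 2t + 1 servers for shared-times-shared"
    m = len(db_shares[0])
    e = [1 if k == i else 0 for k in range(m)]
    q_shares = share_vector(e, t, xs)
    answers = [server_answer(db_shares[l], q_shares[l]) for l in range(len(xs))]
    return recover(xs, answers)
```

With t = 1 and three servers, each server sees only random-looking field elements for both the index and the query, yet the interpolated value equals the requested entry.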

Setup. The client first constructs an index I′ representing keyword-file relationships as in ODSE^wo_xor. Instead of encrypting I′, the client creates shares of I′ by SSS. Since SSS operates on elements in F_p, each row of I′ is split into ⌊log_2 p⌋-bit chunks before SSS computation. So, the index I_l is the SSS share of I′ for server S_l, which is a matrix of size M × 2N′, where I_l[i, j] ∈ F_p and N′ = N/⌊log_2 p⌋. The detail is as follows.
Update. We execute Write-Only ORAM on the column dimension of the encrypted index for the file update. Recall that in ODSE^wo_xor, λ random columns of the original index I′ are read to update one column. In ODSE^wo_it, each column of the index I_l on S_l contains the share of ⌊log_2 p⌋ successive columns of I′. Therefore, the client reads λ′ = ⌈λ/⌊log_2 p⌋⌉ random columns of I_l from t + 1 servers to recover λ columns of I′ before performing the update. The detail is as follows.
(a) Pick a pair ⟨id, Î⟩ from stash S, set ← ĵ, and remove ĵ from the dummy set D.
7. Create SSS shares for the λ′ columns, for each j ∈ J and i = 1, ..., M. Send Î_l[*, j] to S_l for each j ∈ J and l = 1, ..., ℓ. Each server S_l updates its share index as I_l[*, j] ← Î_l[*, j] for each j ∈ J.
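The chunking of index rows into field elements and the resulting number of columns to read, λ′ = ⌈λ/⌊log_2 p⌋⌉, can be illustrated as follows. The helper names are hypothetical, the bit order inside a chunk (most significant bit first) and the zero padding of the last chunk are assumptions of this sketch, and the actual sharing of each chunk with SSS is left out.

```python
def chunk_bits(p):
    """floor(log2 p): the number of bits that always fit into one element of F_p."""
    return p.bit_length() - 1


def pack_row(bits, p):
    """Split a binary row into floor(log2 p)-bit chunks, each an integer below p."""
    c = chunk_bits(p)
    chunks = []
    for start in range(0, len(bits), c):
        piece = bits[start:start + c]
        piece = piece + [0] * (c - len(piece))  # zero-pad the last chunk
        value = 0
        for b in piece:
            value = (value << 1) | b
        chunks.append(value)
    return chunks


def unpack_row(chunks, n, p):
    """Inverse of pack_row: recover the first n bits of the row."""
    c = chunk_bits(p)
    bits = []
    for value in chunks:
        bits.extend((value >> (c - 1 - k)) & 1 for k in range(c))
    return bits[:n]


def lambda_prime(lam, p):
    """Number of shared columns covering lam original columns: ceil(lam / floor(log2 p))."""
    c = chunk_bits(p)
    return -(-lam // c)
```

For a 16-bit prime each chunk carries 15 bits, so a row of 36 bits becomes 3 field elements, and covering λ = 80 original columns requires reading 6 shared columns.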

Security
Definition 1 (ODSE security). Let op = (op_1, ..., op_q) be an operation sequence over the distributed encrypted index I, where op_i ∈ {Search(w), Update(f_id)}, w is a keyword to be searched and f_id is a file with keywords to be updated. Let ODSE_j(op) represent the ODSE client's sequence of interactions with server S_j, given an operation sequence op.
An ODSE is t-secure if ∀L ⊆ {1, ..., ℓ} s.t. |L| ≤ t, for any two operation sequences op and op′ where |op| = |op′|, the views {ODSE_{i∈L}(op)} and {ODSE_{i∈L}(op′)} observed by a coalition of up to t servers are (perfectly, statistically or computationally) indistinguishable.
Remark 1. One might observe that search and update operations in ODSE schemes are performed on rows and columns of the encrypted index, respectively. This access structure might enable the adversary to learn whether the operation is a search or an update, even though each operation is secure. Therefore, to achieve security as in Definition 1, where the query type should also be hidden, we can invoke both search and update protocols (one of them is a dummy operation) regardless of whether the intended action is search or update.
We argue the security of our proposed schemes as follows. (i) Oblivious Search: ODSE^wo_it leverages an SSS-based PIR protocol and therefore achieves t-privacy for keyword search due to the t-privacy property of SSS [13]. (ii) Oblivious Update: The index in ODSE^wo_it is SSS-shared, which is information-theoretically secure in the presence of t colluding servers. ODSE^wo_it also employs Write-Only ORAM, which offers statistical security due to the negligible write failure probability. Therefore, in general, the update access pattern of the ODSE^wo_it scheme is information-theoretically (statistically) indistinguishable to a coalition of up to t servers. (iii) ODSE Security: By Remark 1, ODSE^wo_it performs both search and update protocols regardless of the actual operation. As analyzed above, search is t-private and the update pattern is statistically t-indistinguishable. Therefore, ODSE^wo_it is information-theoretically (statistically) t-secure by Definition 1. □

Malicious Input Tolerance
We have shown that ODSE schemes offer a certain level of collusion-resiliency in the honest-but-curious setting where the server follows the protocol faithfully.
In some privacy-critical applications, it is necessary to achieve data integrity in the malicious environment, where the adversary can tamper with the query and data to compromise the correctness and privacy of the protocol. We show that ODSE schemes can be extended to detect and be robust against malicious servers as follows. In ODSE^wo_xor, we can leverage a Message Authentication Code (e.g., HMAC) as presented in [19], where an authenticated tag for each row and each column of I is generated. The server performs the operations (i.e., PIR, Write-Only ORAM) on such tags in the same way as on the encrypted index data and sends the result to the client. The client can recover/decrypt the row/column as well as its authenticated tag to verify the integrity.
Since ODSE^wo_it relies on SSS as the building block, we can not only detect but also be robust against malicious servers. The main idea is to leverage a list decoding algorithm as in [13] when the Lagrange interpolation in the SSS.Recover algorithm does not return a consistent value. Such techniques also make it possible to determine precisely which server has tampered with the data. We refer readers to [13] for a detailed description. In general, list decoding tolerates t_m incorrect shares of [[α]]^(t), where t_m ≤ t < ℓ − ⌈√(ℓt)⌉.

Configurations
Implementation Details. We implemented all ODSE schemes in C++. Specifically, we used Google Sparsehash to implement hash tables T_f and T_w. We utilized the Intel AES-NI library to implement AES-CTR encryption/decryption in ODSE^wo_xor. We leveraged Shoup's NTL library for the pseudo-random number generator and arithmetic operations over finite fields. We used the ZeroMQ library for client-server communication. We used multi-threading to accelerate PIR computation at the server. Our code is publicly available at https://github.com/thanghoang/ODSE
Hardware and Network Settings. We used Amazon EC2 with r4.4xlarge instances for the server(s), each equipped with 16 vCPUs Intel Xeon @ 2.3 GHz and 122 GB RAM. We used a laptop with Intel Core i5 @ 2.90 GHz and 16 GB RAM as the client. All machines ran Ubuntu 16.04. The client established a network connection with the server via WiFi. We used a real network setting, where the download and upload throughputs are 27 and 5 Mbps, respectively.
Dataset. We used subsets of the Enron dataset to build I containing from millions to billions of keyword-file pairs. The largest database in this study contains around 300,000 files with 320,000 unique keywords. Our tokenization is identical to [21] so that our keyword distribution and query pattern are similar to [21].
Instantiation of Compared Techniques. We compared ODSE with a standard DSSE scheme [9], and the use of generic ORAM atop the DSSE encrypted index. The performance of all schemes was measured under the same setting and in the average-case cost, where each query involves half of the keywords/files in the database. We configured ODSE schemes and their counterparts as follows.
• ODSE: We used two servers for ODSE^wo_xor and three servers for the ODSE^wo_it scheme. We selected λ = 4 for ODSE^wo_xor, and λ′ = 4 with F_p where p is a 16-bit prime for ODSE^wo_it. We note that selecting a larger p (up to 64 bits) can reduce the PIR computation time, but also increases the bandwidth overhead. We chose a 16-bit prime field to achieve a balanced computation vs. communication overhead.
• Standard DSSE: We selected one of the most efficient DSSE schemes by Cash et al. in [9] (i.e., the Π^dyn_2lev variant) to showcase the performance gap between ODSE and standard DSSE. We estimated the performance of Π^dyn_2lev using the same software/hardware environments and optimizations as ODSE (e.g., parallelization, AES-NI acceleration). Note that we did not use the Java implementation of this scheme available in the Clusion library [1] for comparison due to its lack of hardware acceleration support (no AES-NI) and the difference between running environments (Java VM vs. C). Our estimation is conservative in that we used numbers that would be better than those of the Clusion library.
• Using generic ORAM atop DSSE encrypted index: We selected non-recursive Path-ORAM [25] and Ring-ORAM [22], rather than recent ORAMs, as ODSE counterparts since they are the most efficient generic ORAM schemes to date. Since we focus on the encrypted index rather than the encrypted files in DSSE, we did not explicitly compare our schemes with TWORAM [12] but instead used one of their techniques to optimize the performance of using generic ORAM on the DSSE encrypted index. Specifically, we applied the selected ORAMs on the dictionary index containing keyword-file pairs as in [21] along with the round-trip optimization as in [12]. Note that our estimates are also conservative in that memory access delays were excluded, and cryptographic operations were optimized and parallelized to make a fair comparison between the considered schemes.

Overall Results
Figure 5 presents the end-to-end delays of ODSE schemes and their counterparts, where both search and update are performed in ODSE schemes to hide the actual type of operation (see Remark 1). ODSE offers higher security than standard DSSE at the cost of a longer delay. However, ODSE schemes are 3×-57× faster than the use of generic ORAMs to hide the access patterns. Specifically, with an encrypted index containing ten billion keyword-file pairs, Π^dyn_2lev took 36 ms and 600 ms to finish a search and an update operation, respectively. ODSE^wo_xor and ODSE^wo_it took 2.8 s and 7.1 s, respectively, to accomplish both keyword search and file update operations, compared with 160 s by using Path-ORAM with the round-trip optimization [12]. ODSE^wo_xor is the most efficient in terms of search, with a delay of less than 1 s. This is due to the fact that ODSE^wo_xor only requires XOR operations and the size of the search query is minimal (i.e., a binary string). ODSE^wo_it is more robust (e.g., malicious tolerant) and more secure (e.g., unconditional security) than ODSE^wo_xor at the cost of a higher search delay (i.e., 4 s) due to the larger search query and SSS arithmetic computations. For the file update, ODSE^wo_it takes 3 s, which is slightly higher than ODSE^wo_xor (i.e., 2.2 s) since it needs to transmit more data (4 blocks vs. 4 columns) to more servers (3 vs. 2). We further provide a comparison of ODSE schemes with their counterparts in Table 1. We dissected the total cost to investigate which factors contributed the most to the latency of ODSE schemes as follows.

Detailed Cost Analysis
Figure 6 presents the total delays of separate keyword search and file update operations, as well as their detailed costs in ODSE schemes. Note that ODSE performs both search and update (one of them is dummy) to hide the actual type of operation performed by the client.
• Client processing: As shown in Fig. 6 [...] client delay of ODSE schemes can be further reduced (by at least 50%-60%) via pre-computation of some values such as row keys and select queries (which only contain shares of 0 or 1). For the file update, the client performs decryption and re-encryption on λ columns (in ODSE^wo_xor), or SSS over λ′ blocks (in ODSE^wo_it). Since we used crypto acceleration (i.e., Intel AES-NI) and highly optimized number theory libraries (i.e., NTL), all these computations only contributed a small fraction of the total delay.
• Client-server communication: Data transmission is the dominating factor in the delay of ODSE schemes. The communication cost of ODSE^wo_xor is smaller than that of other ODSE schemes, since the search query and the data transmitted from servers are binary vectors. In ODSE^wo_it, the size of components in the select vector is 16 bits. The communication overhead of ODSE^wo_it can be reduced by using a smaller finite field, but at the cost of increased PIR computation on the server side.
• Server processing: The cost of PIR operations in ODSE^wo_xor is negligible as it uses XOR. The PIR computation of ODSE^wo_it is reasonable, as it operates on batches of 16-bit values. For update operations, the server-side cost is mainly due to memory accesses for the column update. ODSE^wo_it is highly memory access-efficient since we organized the memory layout for column-friendly access. This layout minimizes the memory access delay not only in update but also in search, since the inner product in PIR also accesses contiguous memory blocks under this organization. In ODSE^wo_xor, we stored the matrix for row-friendly access to permit efficient XOR operations during search. However, this requires the file update to access non-contiguous memory blocks. Hence, the file update in ODSE^wo_xor incurred a higher memory access delay than that of ODSE^wo_it, as shown in Fig. 6.
• Storage overhead: The main limitation of ODSE schemes is the size of the encrypted index, whose asymptotic cost is O(N · M), where N and M are the number of files and unique keywords, respectively. Given the largest database used in our experiments, the size of our encrypted index is 23 GB. The client storage includes two hash tables of size O(M) and O(N log N), the stash of size O(M · log N), the set of dummy column indexes of size O(N log N), a counter vector of size Ω(N) and a master key (in the ODSE^wo_xor scheme). Empirically, with the same database size discussed above, the client requires approximately 22 MB in both ODSE schemes.

Experiment with Various Query Sizes
We studied the performance of our schemes and their counterparts under various numbers of keywords and files involved in search and update operations, which we refer to as the "query size". As shown in Fig. 7, ODSE schemes are more efficient than using generic ORAMs when more than 5% of keywords/files in the database are involved in the search/update operations. Since the complexity of ODSE schemes is linear in the number of keywords and files (i.e., O(M + N)), their delay is constant and independent of the query size. The complexity of ORAM approaches is O(r log^2(N · M)), where r is the query size. Although the bandwidth cost of ODSE schemes is asymptotically linear, their actual delay is much lower than using generic ORAM, whose cost is poly-logarithmic in the total number of keywords/files but linear in the query size. This confirms the results of Naveed et al. in [21] on the performance limitations of the generic ORAM and DSSE composition, wherein we used the same dataset for our experiments.

Conclusion
We proposed a new set of Oblivious Distributed DSSE schemes called ODSE, which achieve full obliviousness, hidden size pattern, and low end-to-end delay simultaneously. Specifically, ODSE^wo_xor achieves the lowest end-to-end delay with the smallest communication overhead among all of its counterparts, with the highest resiliency against colluding servers. ODSE^wo_it achieves the highest level of privacy with information-theoretic security for access patterns and the encrypted index, along with robustness against malicious servers. Our experiments demonstrated that ODSE schemes are one order of magnitude faster than the most efficient ORAM techniques over the DSSE encrypted index. We have released the full implementation of our ODSE schemes for public use and wide adoption.

Fig. 7. Latency of ODSE schemes and their counterparts with different fraction of keywords/files involved in a search/update operation.

• Full obliviousness with information-theoretic security: ODSE seals information leakages due to accesses on the encrypted index I that lead to statistical attacks, such as forward/backward privacy, query types (search/update), hidden size and access patterns. ODSE^wo_xor and ODSE^wo_it offer computational and information-theoretic security for I and operations on it, respectively.
• Robustness against malicious servers: ODSE^wo_it can tolerate a certain number of malicious servers in the system.
• Full-fledged implementation and open-sourced framework: We fully implemented all the proposed ODSE schemes, and evaluated their performance on real cloud infrastructure. To the best of our knowledge, we are among the first to open-source an oblivious access framework for the DSSE encrypted index that can be publicly used for comparison and wide adoption (see Sect. 5).

Table 1. Comparison of ODSE and its counterparts for oblivious access on I.
a The encrypted index in ODSE^wo_it is information-theoretically (IT) secure because it is SSS-shared. Other schemes employ IND-CPA encryption so that their index is computationally secure (see Sect. 4).
b All ODSE schemes perform search and update protocols to hide the actual query type. In ODSE^wo_xor, search is IT-secure due to XOR-based PIR and update is computationally secure due to IND-CPA encryption. Hence, its overall security is computational.
c ℓ is the number of servers. In ODSE^wo_it, the encrypted index and the search query are SSS-shared with the same privacy level. Generic ORAM-based solutions have a stronger adversarial model than ours since they are not vulnerable to the collusion that arises in the distributed setting.