# From Bi-Linear Maps to Searchable Encryption

## Pairings-Based Cryptography

### Introduction

Theoretical research into pairings-based cryptography has been a well-researched area over the last few years, this cryptography scheme is based on the mapping of two cryptographical groups which allows for a new cryptographical scheme based on a trapdoor permutation between the groups with some interesting complexity properties.

These two groups are called a Gap Groups in many instances, where the Decisional Diffie-Helman problem is easy, but the Computational Diffie-Helman still is hard to solve. Weil and Tate pairings are used in implementations but requires complex mathematics, this is why in this section we will use a slightly more abstract means to explain bilinear maps..

### Bilinear Maps

Mostly all constructions of pairings-based cryptosystems use bilinear maps, for this we consider two groups $G_1$ and $G_2$ or a prime order $q$. We can denote $G_1$ using additive notation and $G_2$ using multiplicative notation, even as the group operations of $G_1$ and $G_2$ are very different, $G_1$ can also be written as a multiplicative group operation in some literature.

If we consider two generators of group $G_1$ as $\ P$ and $Q$, we can write:

$aP\ =\ P+P+\ ..P\ \}\ a$ times

Using this we can also consider a map $e$ as follows: ${e:\ G}_1\ \times G_1\ \to \ \ G_2$

This type of bilinear map has a main group $G_1$ and a shadow group $G_2$ where we map two group elements in the first group to the second group would need have properties between them in order for it to be useful,  Bilinearity, non-degenerate and computable.

Bilinearity:

For Group $G_1$ using generators $P$ and $\ Q$ we can define a map to $G_2$, where the additive operation in group $G_1$ equals the multiplicative operation in $G_2$:

$\forall P,\ Q\in G_1, \forall a,\ b\ \in {\mathbb{Z}}^*_q,$
$Map:e\left(aP,\ bQ\right)=e(P,\ Q)^{ab}$

If G1 and G2 where both multiplicative groups then the Bilinearity property would be the following:

• $\forall P,\ Q\in G_1,\ \forall a,\ b\ \in {\mathbb{Z}}^*_q,$
• $Map:e\left(P^a,\ Q^b\right)=e(P,\ Q)^{ab}$

This has an interesting property whereby it beaks the decisional Diffie-Helman problem, but this will be discussed in more details later.

Non-Degeneracy:

If all the elements map to the identity of the group then if would not have any additional computational aspects to explore. It is therefore important not to create a map with the identity of either of the groups.
$\forall P\ \in G_1,\ P\ \neq 0\ \ \left\langle e\left(P,\ P\right)\right\rangle =G_2\ (e\left(P,\ P\right)\ generates\ G_2)$
Such that: $P\ \neq 0\ \Rightarrow e\left(P,\ P\right)\neq 1$

Computability: $e$ should be efficiently computable, there are some constructions of maps that are hard to compute.

The construction of these bilinear pairs has been proven by Wei and Tate pairings, where $G_1$ is a typical elliptic curve group, and $G_2$ is a finite field. These have proven to provide complex problems across these groups to construct cryptographical schemes.

### Complex Problems

For the usage of bilinear maps in cryptographical schemes, we define a one-way function using two problems, the Decisional Diffie-Helman problem and the discrete log problem.

Theorem 1: The Discrete Log Problem in $G_1$ is no harder than the Discrete Log Problem in $G_2$.

Proof 1: If we use our additive notation and consider that $Q\ =\ aP$, we then need to solve $a$, which is random, for a given $P$ and a random $Q$
$e\left(P,\ Q\right)=e\left(P,\ aP\right)=e(P,\ P)^a$

With this we can effectively reduce the Discrete Log Problem in $G_1$ to the Discrete Log Problem in $G_2$, if we are given $P\in G_1$ and a random $Q\in G_1$ then the mapping of $e$ is easily computable by calculating ${{log}_P (Q)}$ as:

$P^=e(P,\ P)$
$Q^=e(P,\ Q)$
$a=\ {{log}_{P^} \left(Q^\right)\ in\ \ G_2\ }$
$a={{log}_P (Q)\ }\ in\ G_1$

With this we can see that the difficulty of solving the discrete log problem in both groups are the same, since the computation of ${{log}_P (Q)\ }$ have the same complexity in both groups.

Theorem 2: The Decisional Diffie-Helman is easy in $G_1$.

Proof 2: Solving the Decisional Diffie-Helman problem in $G_1$ requires distinguishing between:

$\langle P,\ aP,\ bP,\ cP \rangle \ with\ a,\ b,\ c\ \in {\mathbb{Z}}^*_q$ and
$\langle P,\ aP,\ bP,\ abP \rangle \in {\mathbb{Z}}^*_q$
If we can define $P,A,B$ and $C$ as the distinguishers four values, then the distinguisher function is as follows:

$v_1=Map:e(A,\ B)\ and\ v_2=Map:e(P,\ C)$

If we have that $v_1=v_2$, then the tuple is of type $\langle P,\ aP,\ bP,\ abP\rangle$

From this we can take $C=\ abP$ from (Theorem 1)

$e(A,\ B)=e(aP,\ bP)$
$=e(P,\ P)^{ab}$
$=e(P,\ abP)$
$=e(P,\ C)$

Since the map $e$ is non-degenerate we can set $e\left(A,\ B\right)=e(P,\ C)$ equivalent to $c\ =ab$. The distinguisher has thus a significant advantage given the mapping $e$ to decide the Decisional Diffie-Helman problem.

Theorem 3 The Bilinear Diffie-Helman Problem is easy in $G_1$ but difficult in $G_2$

Fact: If we are given two groups $G_1$ and $G_2$ with a map between them as $e$, there are no polynomial time algorithm that can compute $\left(P,\ aP,\ bP,\ cP\right)for\ \ some\ a,\ b,\ c\ \in \ {\mathbb{Z}}^*_q$ in $G_1$ given $e(P,\ P)^{abc}$ in $G_2$. With this we can construct the following properties between the groups as the following hard problems:

$e(aP,\ bP)^c\ in\ G_1=e(P,\ P)^{abc}\ in\ G_2$
$e(aP,\ cP)^b\ in\ G_1=e(P,\ P)^{abc}\ in\ G_2$
$e(bP,\ cP)^a\ in\ G_1=e(P,\ P)^{abc}\ in\ G_2$

Using these theories, we can now construct cryptosystems based on these hard problems found in these groups.

### Cryptography Schemes

Using these complexity problems, there has been an abundance of cryptosystems developed over the years, where the two most notable are the 3-party key agreement scheme, identity based encryption and searchable encryption.

#### The 3-party Diffie-Helman key agreement scheme

Joux  introduced in 2000 a three-party key agreement scheme using bilinear maps utilizing the Bilinear Diffie-Helman problem for the construction.

If we have two groups $G_1$ and $G_2$ with $P$ as a generator of $G_1$, and three parties $A,\ B,\ C$ that have respective secrets $a,\ b,\ c\ \in \ {\mathbb{Z}}^*_q$ we can construct a key agreement scheme where each party shares a secret key as follows:
$A\ \longrightarrow B,\ C:aP$
$B\ \longrightarrow A,\ C:bP$
$C\ \longrightarrow A,\ B:cP$

Using the Bilinear Diffie-Helman Problem we can define the following:

$A$ computes $e(bP,\ cP)^a=e(P,\ P)^{abc}$
$B$ computes $e(aP,\ cP)^b=e(P,\ P)^{abc}$
$C$ computes $e(aP,\ bP)^c=e(P,\ P)^{abc}$

All parties now have the same shared key $K=e(P,\ P)^{abc}\ \in G_2$ that can be used as an input to a symmetric encryption scheme.

#### Identity Based Encryption

The idea of using private information, like an email address, as a public key has been long debated and researched, whereby the corresponding private key can be delivered to the rightful owner. The role of the key generator must be to verify the private information before distributing the private key to the owner, although a public key infrastructure would solve this problem, there were substantial research into this area to move away from a trusted third party, and having the identity as part of the encryption.

In Dan Boneh’s and Franklin’s paper an Identity based encryption scheme was created to remove the public key infrastructure with the use of bilinear maps and the bilinear Diffie-Helman problem, incorporating a random oracle model. This protocol consists out of five phases:

Setup

• Defining two groups $G_1$ and $G_2$ with a bilinear map $e:G_{1\ }\times G_1\ \to \ G_2$ and $P$ as a generator
• A System wide secret key $s\in \ {\mathbb{Z}}^*_q$
• A corresponding system wide public key $P_{pub}=sP$, which are not distributed
• Public hash function $H_1:\{0,\ 1{\}}^*\ \to G^*_1$, a random oracle
• Public hash function $H_2:G_2\ \to \{0,\ 1{\}}^n$ for some fixed $n$, the second random oracle.
• The message space $\mathcal{M}=\{0,\ 1{\}}^n$
• The cypher space $C=G^*_1\ \times \{0,\ 1{\}}^n$

To create a private key for a corresponding participant for $ID\in \{0,1{\}}^*$ the system computes:

$Q_{ID}=H_1(ID)$ and
$d_{ID}=\ {sQ}_{ID}$ which is the private key that can be distributes to the user.

Encryption:
If we are now given a message $m\ \in \mathcal{M}$ we can compute the cyphertext as follows:

• $Q_{ID}=H_1\left(ID\right)\in G^*_1$
• We then choose a random $r\in \ {\mathbb{Z}}^*_q$
• We can now compute $g_{ID}=e\left(Q_{ID},\ P_{pub}\right)\in G_2$
• And create the cyphertext: $c=(rP,\ m\oplus H_2(g^r_{ID}))$

Decryption:

When the user receives the cyphertext, he has $c=(u,\ v)\in C$ and can decrypt it using his corresponding private key $d_{ID}$ and $H_2$

$m=v\oplus H_2(e(d_{ID},\ u))$

The main reason that both encryption and decryption works are because of the properties of pairings and the mask generated by $H_2$ that is xor’ed with the plaintext. We can prove the correctness by using simple substitution from the parameters above:

$m=v\oplus H_2(e(d_{ID},\ u))$
$m=v\oplus H_2(e(sH_1(ID)\ ,\ rP))$
$m=v\oplus H_2(e(H_1(ID)\ ,\ P))^{rs}$
$m=v\oplus H_2(e(Q_{ID}\ ,\ sP))^r$
$m=v\oplus H_2(e(Q_{ID}\ ,\ P_{pub}))^r$
$m=v\oplus H_2(g^r_{ID}$)
$m=(m\oplus H_2 (g^r_{ID}))\oplus H_2(g^r_{ID}))$

$=m$

This scheme provides us a way to use the identity as a parameter within the encryption and decryption without the use of a third party. The usage of identity is important, as this can bind the encryption and decryption to a owner of the keys.

## Searchable Encryption

Searchable encryption schemes are a well-studied topic, and there have been several constructions using order revealing and order preserving schemes. For the a simplified construction a protocol I have chosen to use an order revealing encryption schemes based on bilinear maps, this construction  is proven to be secure against adaptively chosen keyword attacks assuming the bilinear Diffie-Helman problem is intractable using the random oracle model.

To use this construction , we will look at the following scenario:

For this scenario, we need to define four entities that will be involved in the scheme:

• Users (1..n): responsible for the creation of messages that are sent to a trusted party for routing. These messages are sent and received via a secure channel to the messaging server.
• Third Party / Message Server: The messaging platform, that routes messages to users, and that can test weather a certain list of keywords are present in the message.
• Legal Authority (1..n): The party interested in searching the message data.
• Trusted Third Party: a Party responsible for securing the private key

Suppose a Legal authority needs to be alerted when certain keywords are transmitted to a messaging server. For example, a user sends a message to another user that he is planning a bombing, the “bombing” needs to create an alert on the messaging server, and the legal authority needs to be sent a encryption of the message thread.

If the messages between the users are encrypted using semantic means, then the messaging server cannot make any alerting decisions as it cannot decrypt the messages. Our goal here is to ensure that the messaging server provide a way to test whether a keyword has been transmitted between the users, without revealing the content of the messages. This can only be achievable by the legal authority providing a list of keywords to the message server that can be used, as well messaging server needs to have access to both the encryption and decryption key of the user’s messages.

To do so, a user encrypts messages between users and messaging server using a standard public key cryptosystem and saves it in his database. The messages are appended with a Public-Key Encryption with keyword Search (PKS) of each keyword. For example, User Steve sends Peter a message $M$ with words $W_1$, $W_2..W_m$, then the trusted messaging server create:
$E_{A_{pub}}(Message)\ \parallel PKS(A_{pub},\ W_1)\parallel \dots \parallel PKS(A_{pub},\ W_m)$
Where $\parallel$ denotes concatenation and $E_{A_{pub}}$ is the public key of the legal authority (Alice). The reason for this form of encryption is so that the legal authority can provide a trapdoor $T_w$ to the messaging server to test whether a certain keyword has been used. Given a searchable encryption for a keyword $PKS(A_{pub},\ \ W^)$ and a trapdoor $T_w$ the messaging server can determine is $W=W^$ , if it’s the case that $W\neq W^$ then the messaging server does not learn any information about the word. It’s also quite interesting to note that this is not a very communitive scheme, as the searchable encryption (PKS) is constructed only using the public key of the legal authority.

### Definitions

Throughout this section we will refer to a negligible function as $f\mathrm{:}\mathrm{\ }\mathbb{R}\to \mathrm{[}0,\ 1]$ where $f\left(s\right)<1/g(s)$ for any polynomial $g$ and sufficiently large $s$. I will start by defining a searchable public key encryption scheme (PKS) where the public key refers to the cyphertext created by the messaging server  using the public key of the legal authority , and the searchable encryption scheme (PKS) does not reveal any information about the message.

Our goal is to enable the legal authority to send a short secret key ($T_w$) for a specific word to the messaging server, so that the messaging server can locate all messages that have this keyword without revealing the word $W$. The secret key ($T_w$) produced by the legal authority is based on the private key, and the messaging server send the message containing the words back to the legal authority, encrypted using the corresponding public key.

Definition 1.1: The following polynomial time randomized algorithms are part of a non-interactive searchable encryption scheme (PKS).

• $KeyGen(s)$: For a security parameter $s$ a corresponding public/private key pair is generated ($A_{priv},\ A_{pub})$ by the legal authority and the public key is sent to the messaging server.
• $PKS(A_{pub},\ W)$: For a word $W$ in the message, a searchable encryption (PKS) is generated using the public key $A_{pub}$ of the legal authority. We will denote the $PKS$ function as $S$
• $Trapdoor(A_{priv},\ W)$: Given the private key of the legal authority, a certain word $W$ produces a trapdoor $T_w$.
• $Test(A_{pub},\ S,\ T_W)$: Given the public key of the legal authority $A_{pub},\ \$and a searchable encryption $S$ on the messaging server, a trapdoor $T_w$ outputs yes’ if $W=W'$

The legal authority will run the $KeyGen$ algorithm and generate its public/ private key pairs, and then use the $Trapdoor$ function to generate a series of trapdoors for words $W_1..\ W_i$ that it wants to search for. The messaging server will then use the $Test$ function to determine whether a given message has a keyword $W_i$.

### Construction

For the definition above I will provide an efficient construction using bilinear maps based on a variant of the Decision Diffie-Hellman assumption with identity based encryption

We will use two groups $G_1,\ G_2$ of prime order $p$ and a bilinear map $e:G_1\times G_1\to G_2$ between the two groups. This map satisfies the following three properties where the size of $G_1$ and $G_2$ is determined by a security parameter:

• Computable: If you are given two elements in $G_1$ as $g,\ h$ then there exists a polynomial time algorithm to compute the map $e(g,\ h)$ in $G_2$
• Bilinear: for all integers in the prime order, we have a map $e(g^x,\ g^y)$ = $e(g,\ g)^{xy}$
• Non-degenerate: if $g$ is a generator of $G_1$ then the map $e(g,\ g)$ is a generator of $G_2$

From this we can build a non-interactive searchable encryption scheme based on bilinear maps. For this we will need two hash function, or random oracles in each group as:

$H_1:\{0,\ 1{\}}^*\to G_1$ and $H_2:G_2\to \{0,\ 1{\}}^{{log p\ }}$

Based on definition 1.1 we will construct the scheme using the same model based on the Dan Boneh Searchable Encryption Scheme:

• $KeyGen(s)$: The security parameter $s$ determines the size of the prime order $p$ of the groups $G_1$and $G_2$. The legal authority then also selects a random $\alpha \in {\mathbb{Z}}^*_p$ and a generator $g$ of $G_1$. The Output is a public key $A_{pub}=[g,\ h=g^{\alpha }]$ and a private key $\alpha$. The public key is then distributed to the messaging server.
• $PKS(A_{pub},\ W)$: Using the public key and a word $W$, the messaging server computes a bilinear map $t\ =e(H_1(W),\ h^r)\in G_2$ using the random oracle and a random $r\in {\mathbb{Z}}^*_p$. Then outputs a searchable encryption $PKS(A_{pub},\ W)=[g^r,\ H_2(t)]$.
• $Trapdoor(A_{priv},\ W)$: The legal authority uses the random oracle and its private key to generate a trapdoor $T_w=H_1(W)^{\alpha }\in G_1$
• $Test(A_{pub},\ S,\ T_W)$: When the messaging server receives a Test function from the legal authority as $S=[A,\ B]$ it can test if $H_2(e(T_w,\ A))=B$

The construction of the scheme can be viewed as a derivative of Identity Based Encryption with a limited number of identities. Using this scheme, the messaging server needs to have the ability to create an index of the words that’s exchanged between the users of the system that can be tested. Unfortunately, this construction has several issues relating to the sharing of the creation of the trapdoor function. None the less, the use of bi-linear maps and hash functions allows us to identify encrypted words without revealing what they actually are.