SRPC-CS2510

Simple Remote Procedure Call

Keren Ye, Xinyue Huang

1. Introduction

Remote Procedure Call (RPC) is a language-level approach to distributed computing built on a fundamental concept, the procedure call. RPC allows distributed programs to be written in the same conventional style as programs for centralized computer systems.

In this project, we implemented a Simple Remote Procedure Call Model (SRPC) consisting of a Stub Generator, a Directory Server, a Client Stub, and a Server Stub. We proposed several designs to address related problems of parameter passing, binding, exception handling, call semantics, performance, and data representation.

In the following sections, we first give a short introduction to the main components of the SRPC package. We then show a schematic diagram of the sequence of events that occur when a remote procedure call is invoked. Implementation details and a high-level description of the procedure functions follow. Finally, we describe the general format of the request / reply messages exchanged among our components.

2. Main components

Our SRPC consists of a Stub Generator, a Directory Server, a Client Stub, and a Server Stub.

2.1 Stub Generator

In a client-server system, the glue that holds everything together is the interface definition, as specified in the Interface Definition Language, or IDL. In our RPC model, an IDL file is written first. Then the Stub Generator, an IDL compiler, is invoked to process it and generate the code for both the Server Stub and the Client Stub.

A simplifying assumption we made is that the globally unique identifier (program ID) is provided by the user, who must guarantee that it does not conflict with any other program ID. The same assumption of uniqueness applies to procedure IDs within the same program.

2.2 Directory Server

To allow a client to call a server, the server must first register with the directory server. Unlike the mechanism in DCE, which involves two steps (locating the server’s machine and locating the server on that machine), our SRPC uses a simpler communication scheme. The Directory Server in our RPC model is responsible for locating both the server’s machine and the end point on it. Its key functions are registering servers, unregistering them, and tracking their heartbeats.

2.3 Server Stub / Client Stub

Common header files and the code for both the server stub and the client stub are generated by the stub generator. The client stub and the server stub hide the actual communication between the caller at the client end and the callee at the server end, providing access transparency for upper-level users. Users can therefore concentrate on the logic of their functions and ignore the details of communication.

The graph below shows all the interfaces exported by the directory server and the server. Interfaces such as register and unregister are used for updating the lookup table in the directory server. Interfaces such as get_insts_by_id and get_svr_id are used for binding a client to a server (the client first invokes get_insts_by_id to get the server instances, then chooses one and verifies its version using get_svr_id). Finally, the client calls the remote procedure using call_by_id.

3. Schematic Diagram

The sequence of events between components is shown in the following graph.

The caller (the high-level user) first requests a procedure call. The client stub receives the procedure call and asks the directory server for the list of available servers. Once a server is chosen, the client stub verifies the program ID and version number with the server stub. If this information matches, the client stub sends an RPC request package to the server stub. The server stub receives and unpacks the package and calls the callee (server) to execute the procedure. After the procedure has been executed, the server stub packs the result into a package and sends it back to the client stub. The client stub then receives and unpacks the response package and returns the result to the caller.

4. Interactions between Components

4.1 Server and Directory Server

Registration of a server makes it possible for a client to locate the server and bind to it. The server calls a function in the SRPC library, register_information(), to register its program ID and version with the directory server.

After receiving the register request, the directory server stores all the related information in a hash table indexed by the program ID and version. When the server shuts down, it calls unregister_information() in the SRPC library to send an unregister request, which withdraws its own instance from the directory server’s lookup table.

4.1.1 Description of Procedure Functions

Called in         Name                    Description
Server            register_information    Notify the directory server to ADD / UPDATE this server instance.
Server            unregister_information  Notify the directory server to DELETE this server instance.
Directory Server  do_register             ADD or UPDATE the information for the server instance indexed by program id and version.
Directory Server  do_unregister           DELETE the information for the server instance indexed by program id and version.
Directory Server  check_timeout           CHECK expiration for all registered server instances; DELETE an instance from the lookup table if no heartbeat request is received from it within a specific amount of time.

4.1.2 Details in Directory Server

The directory server keeps the lookup table in memory as a hash table, in which each entry is a linked list containing all the server instances with the same program ID and version. An example of the lookup table in the directory server is shown below.

Each time a register request is received, the directory server checks whether a linked list for the specific key (generated by concatenating the program id and version) is in the hash table. If so, it searches the retrieved linked list for the specific ip and port. If that server instance exists, the directory server updates its information; otherwise it adds a new record for the newly arrived server instance.

When an unregister request is received, the directory server does the reverse. It first finds the entry for the specific program id and version, then finds and removes the specific server instance from the linked list.
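The register / unregister handling described above can be sketched as follows. This is a minimal sketch, not the project’s actual code: the Instance fields, the ‘#’ separator inside the key, and the function signatures are assumptions made for illustration; only the names do_register and do_unregister come from Section 4.1.1.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical record for one registered server instance.
struct Instance {
    std::string ip;
    int port;
    long last_alive;  // time of the last register (heartbeat) request
};

// Lookup table: key = program id + version, value = list of instances.
using Table = std::unordered_map<std::string, std::list<Instance>>;

// Builds the key by concatenating id and version ('#' is an assumed separator).
std::string make_key(const std::string& id, const std::string& version) {
    assert(!id.empty() && !version.empty());
    return id + "#" + version;
}

// ADD a new instance or UPDATE an existing one.
// Returns true if a new record was added, false if one was updated.
bool do_register(Table& table, const std::string& id, const std::string& version,
                 const std::string& ip, int port, long now) {
    std::list<Instance>& insts = table[make_key(id, version)];
    for (Instance& inst : insts) {
        if (inst.ip == ip && inst.port == port) {
            inst.last_alive = now;  // known instance: refresh it
            return false;
        }
    }
    insts.push_back(Instance{ip, port, now});
    return true;
}

// DELETE an instance. Returns true if it was found and removed.
bool do_unregister(Table& table, const std::string& id, const std::string& version,
                   const std::string& ip, int port) {
    auto entry = table.find(make_key(id, version));
    if (entry == table.end()) return false;
    std::list<Instance>& insts = entry->second;
    for (auto it = insts.begin(); it != insts.end(); ++it) {
        if (it->ip == ip && it->port == port) {
            insts.erase(it);
            if (insts.empty()) table.erase(entry);  // drop the empty list
            return true;
        }
    }
    return false;
}
```

Because a repeated register request only refreshes last_alive, the same call can serve as both the initial registration and the heartbeat.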

4.1.3 Tracking the Client States in Directory Server (Heartbeat)

From the perspective of the directory server, all the server instances can be treated as its clients. Its responsibility is therefore to track the state of every client (that is, to check whether each one is still available). The question we debated is whether this tracking should be initiated by the directory server or by its clients. In the first design, an independent thread in the directory server checks the status of all its clients every minute; in the second, each client reports its own status to the directory server periodically.

We chose the latter. The directory server maintains information about its clients, but we cannot rely on it as a stateful server: if it crashes and reboots, it will probably lose all the information stored in memory. If all clients report their information periodically, however, the directory server can reconstruct its lookup table, not instantly but eventually.

Thus, in our solution, the directory server records the last alive time of every client. It checks these times every minute and DELETEs all expired server instances accordingly.
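The periodic expiry check could look roughly like the sketch below. The Instance record, the table layout, and the signature of check_timeout are assumptions for illustration; the source only states that expired instances are deleted when no heartbeat arrives in time.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical record for one registered server instance.
struct Instance {
    std::string ip;
    int port;
    long last_alive;  // time of the last heartbeat, in seconds
};

using Table = std::unordered_map<std::string, std::list<Instance>>;

// Removes every instance whose last heartbeat is older than `timeout`
// seconds at time `now`. Returns the number of removed instances.
int check_timeout(Table& table, long now, long timeout) {
    assert(timeout > 0);
    int removed = 0;
    for (auto entry = table.begin(); entry != table.end();) {
        std::list<Instance>& insts = entry->second;
        for (auto it = insts.begin(); it != insts.end();) {
            if (now - it->last_alive > timeout) {
                it = insts.erase(it);  // expired: no heartbeat in time
                ++removed;
            } else {
                ++it;
            }
        }
        // Drop table entries whose instance list became empty.
        entry = insts.empty() ? table.erase(entry) : std::next(entry);
    }
    return removed;
}
```

In this sketch the directory server would call check_timeout once a minute from a timer or a background thread; the value of the timeout is left to the caller.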

4.2 Client and Server

4.2.1 Binding

For each RPC message, the client stub calls the function get_insts_by_id() to look up the reachable servers. The function returns a list of available servers. One of these servers is selected at random, and the client stub calls the function get_svr_id() to get the remote service (program) ID and the version number from this server. If the service ID and the version number match, the binding between the client and the remote server is established.
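The binding step can be sketched as below. The types Endpoint and ProgramInfo, the function bind_to_server, and the query_svr_id callback (which stands in for the get_svr_id request) are hypothetical names introduced for this illustration; the sketch also assumes that a mismatch simply fails the binding, since the text does not say whether another server is tried.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <string>
#include <vector>

struct Endpoint { std::string ip; int port; };            // hypothetical
struct ProgramInfo { std::string id; std::string version; };  // hypothetical

// Picks one candidate at random and verifies its program id and version.
// `query_svr_id` stands in for the get_svr_id request to the chosen server.
// On success, stores the chosen server in `bound` and returns true.
bool bind_to_server(const std::vector<Endpoint>& candidates,
                    const ProgramInfo& wanted,
                    const std::function<ProgramInfo(const Endpoint&)>& query_svr_id,
                    std::mt19937& rng, Endpoint& bound) {
    assert(static_cast<bool>(query_svr_id));  // a callback must be supplied
    if (candidates.empty()) return false;     // no server is registered
    std::uniform_int_distribution<std::size_t> pick(0, candidates.size() - 1);
    const Endpoint& chosen = candidates[pick(rng)];
    const ProgramInfo reported = query_svr_id(chosen);
    if (reported.id != wanted.id || reported.version != wanted.version) {
        return false;  // the server runs a different program or version
    }
    bound = chosen;
    return true;
}
```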

4.2.2 RPC

After locating the server, the client stub generates an RPC request package containing the procedure’s ID and the procedure name, and marshals the parameters into the request package. The server executes the corresponding procedure based on the procedure ID and the procedure name. For parameter marshaling, we designed functions to marshal and unmarshal the supported data types: int, string, float, double, array, and matrix. These functions are shared by the server stub and the client stub.

Parameter marshaling has to deal with issues such as the little-endian and big-endian formats of different machines, and multiple data type representations. In our project, we handle the byte-order issue for the integer type. The sender calls htonl() to convert the integer value to network byte order before copying it into the package, and the receiver calls ntohl() to convert it back to host byte order when reading it from the package. In this way the receiver reads the value correctly regardless of either machine’s byte order.
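A minimal sketch of integer marshaling with htonl() and ntohl() is shown below. The function names marshal_int and unmarshal_int and the use of a byte vector as the package are assumptions for illustration; htonl() and ntohl() are the POSIX functions declared in <arpa/inet.h>.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Appends a 32-bit integer to the package in network (big-endian) byte order.
void marshal_int(std::vector<unsigned char>& pkg, std::int32_t value) {
    const std::uint32_t net = htonl(static_cast<std::uint32_t>(value));
    unsigned char bytes[sizeof net];
    std::memcpy(bytes, &net, sizeof net);
    pkg.insert(pkg.end(), bytes, bytes + sizeof net);
}

// Reads a 32-bit integer at `offset`, converts it to host byte order,
// and advances `offset` past the bytes that were consumed.
std::int32_t unmarshal_int(const std::vector<unsigned char>& pkg, std::size_t& offset) {
    assert(offset + sizeof(std::uint32_t) <= pkg.size());
    std::uint32_t net;
    std::memcpy(&net, pkg.data() + offset, sizeof net);
    offset += sizeof net;
    return static_cast<std::int32_t>(ntohl(net));
}
```

On a big-endian host both conversions are no-ops, so the same stub code works on either kind of machine.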

After marshaling, the client stub calls the function rpc_call_by_id() to transmit the package to the server stub and receive the return package from the server stub. The client stub then unmarshals the return package and passes the values back to the caller.

4.3 IDL and Stub Generator

4.3.1 Compound Data Type

1) Problem of Defining Compound Data Type

Designing the stub generator is not easy, because dealing with compound data types is challenging. Take the Matrix type as an example. The first function prototype we proposed for matrix multiplication is shown below.

void multiply(int *A, int *B, int *C, int m, int n, int l);

At first glance, this seems a good function prototype because it is easy to describe in an IDL file. We just need to describe six parameters (three pointers to int and three int parameters) in the XML file. However, when we were writing our stub generator we faced a semantic problem: which two parameters should be associated with parameter A to indicate its number of rows and columns? For instance, if the above function prototype is allowed, users are also free to write a prototype such as the one below.

void multiply(int *A, int m, int n, int *B, int l, int *C);

It is unlikely that the stub generator can understand the meaning of all such function definitions. Thus, in our design the Matrix is described as an indivisible data type, at least in the IDL file, as are data types like String and Array. Later, the stub generator translates these compound data types into multiple parameters that can be understood by the machine.

2) Two approaches in Compound Data Type

Once we had designed the IDL for compound data types, we had to decide how to represent these data types in code. We considered two approaches. The first is:

void clear_matrix(Matrix m);

int m[3][3] = {1,2,3,4,5,6,7,8,9};
Matrix A;
A.row = 3; A.col = 3;
A.data = &m[0][0];
clear_matrix(A);

In this approach, Matrix is a predefined data structure containing three members, in which ‘row’ and ‘col’ indicate the number of rows and columns, and ‘data’ points to the actual data.

The other approach is:

void clear_matrix(int *m, int m_row, int m_col);

int m[3][3] = {1,2,3,4,5,6,7,8,9};
clear_matrix(&m[0][0], 3, 3);

In this approach, we assume that users understand the semantics of this kind of function prototype. Although the parameters are only loosely related in the definition, they are bound together by the code that the stub generator produces.

The first approach seems stricter and makes the code more concise, yet we chose to implement the second one for two main reasons. First, humans can readily understand this kind of function prototype, so its semantics are not a problem. Second, users do not need prior knowledge of predefined data structures such as Matrix, which makes it easier for new users to get started with our SRPC.

In summary, our design decisions for handling compound data types in the IDL and the Stub Generator are:

  1. Users can only specify basic data types (such as int, double, char) or compound data types (such as Matrix, Array, and String) in the IDL file; we give users no freedom to declare other types, such as a pointer to an integer.
  2. When the stub generator recognizes that a parameter has a compound data type, it automatically appends parameters that describe the data. For example, if one of the parameters is ‘Matrix A’, it generates the parameter list (int *A, int A_row, int A_col), in which A_row and A_col are appended automatically.
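The parameter expansion in the second point can be sketched as a small function. Only the Matrix case (int *A, int A_row, int A_col) is taken from the text; the function name expand_param and the expansions shown for Array and String are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Expands one IDL parameter (type, name) into the text of the C parameters
// that the generated stub would declare.
std::string expand_param(const std::string& type, const std::string& name) {
    assert(!name.empty());
    if (type == "Matrix") {
        // From the text: row and column counts are appended automatically.
        return "int *" + name + ", int " + name + "_row, int " + name + "_col";
    }
    if (type == "Array") {
        // Assumption: an array carries a single appended length parameter.
        return "int *" + name + ", int " + name + "_len";
    }
    if (type == "String") {
        // Assumption: a string is a NUL-terminated char buffer.
        return "char *" + name;
    }
    return type + " " + name;  // basic types pass through unchanged
}
```

For example, expand_param("Matrix", "A") yields the parameter list given above, while expand_param("int", "n") is left as "int n".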

4.3.2 Semantics in Function Return Value

A similar semantic problem arose with the definition of the sort function on the project web page. We chose not to implement that prototype because of its semantic ambiguity. The sort function is defined in the project description as below.

int *sort(int size, int *array);

We see two conflicting interpretations of this prototype:

  1. The function returns a pointer to a buffer allocated inside the function, and the user is responsible for freeing this buffer after the call.
  2. The function returns a pointer to the array passed in as a parameter.

Many functions are defined in either of these two ways. An example of the first is the pair of system APIs ‘getaddrinfo’ and ‘freeaddrinfo’: ‘getaddrinfo’ allocates memory for storing address information, and ‘freeaddrinfo’ is responsible for releasing that memory.

An example of the second is the function ‘strstr’ in the C standard library. The pointer returned by ‘strstr’ points into the memory referenced by one of its parameters.

Based on this discussion, we consider a prototype that returns a pointer to be unsuitable for our SRPC. The return type of a function is therefore limited to basic data types such as integer, double, and char.

4.4 Linking All Components

To build all the components of the project as a whole, we provide a Makefile that links all the code together. The linking order is described in the graph below. Note that ‘linking’ here is an abstract concept that covers executing some binary, compiling, actual linking, or simply depending on other files.

Take the client stub as an example. The client stub depends on two other modules:

  1. A common library implemented by us. The client stub may invoke functions in this common library, which encapsulates network communication and other system APIs. The relation between the client stub and the common library is ‘actual linking’.
  2. The stub generator, which is used for generating code. The code for the client stub is generated entirely by the stub generator, so the relation is ‘executing some binary’.

The code below shows the linking steps of the client stub in our main Makefile.

  1. It first checks the dependency on the stub generator; if the stub generator does not exist, it is compiled first.
  2. The stub generator binary is invoked to generate the client stub code (including .h files, .cpp files, and a Makefile for compiling the client stub).
  3. Running the Makefile in the client stub directory links the common library with the client stub code, producing the final client stub library and a demo binary.

# generating client_stub
$(CLIENT_STUB): $(STUB_GENERATOR)
	mkdir -p output/client_stub/include
	mkdir -p output/client_stub/src
	mkdir -p output/client_stub/lib
	./$(STUB_GENERATOR) -x $(IDLFILE) -t client_stub -p output/client_stub
	mv output/client_stub/*.h output/client_stub/include
	mv output/client_stub/*.cpp output/client_stub/src
	cd output/client_stub/ && make
	@echo -e "$(cchighlight)successfully generating $(CLIENT_STUB)$(ccend)"

5. General Format of Request / Reply

The most important thing in message passing is the format of the request / reply messages, that is, the protocol. There are two main kinds of messages in our SRPC. The first kind covers the register / unregister requests and replies between the server and the directory server, and the version-verifying requests and replies between the client and the server. The second kind carries the marshaled data. In messages of this kind, a procedure id must be specified to distinguish the different remote procedures within the same server.

In our design, we generalize both kinds of messages. They share a common header, which indicates the length of the message and carries all the parameters of the general messages. We chose the HTTP protocol because it makes the general messages between server and client easy to debug (all parameters are represented as text). When transferring marshaled data, we use the HTTP body (sent by a POST request) to carry the data, and the URI in the HTTP header to carry auxiliary parameters such as the procedure id.

5.1 Directory Server – Register(Heartbeat)/Unregister

This general interface is provided by the directory server. Server instances can utilize the register interface to register the location into the look-up table retained by the directory server. Also, server instances should keep sending register requests to maintain themselves in the look-up table. When the server instances are about to quit, they use the unregister interface to withdraw themselves.

1) Request Url

POST “http://$host:$port/register”

POST “http://$host:$port/unregister”

2) Request Body

<server>
  <id>program_id</id>
  <name>program_name</name>
  <version>program_version</version>
  <ip>instance_ip</ip>
  <port>instance_port</port>
</server>

3) Reply Body

<message>succ</message>

4) Parameters

Direction  Parameter        Description
Request    program_id       The globally unique program ID.
Request    program_name     The name of the program.
Request    program_version  The version of the program.
Request    instance_ip      The network address of the server instance.
Request    instance_port    The end point (port) of the server instance.
Reply      message          ‘succ’ if the instance was successfully registered / unregistered.
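As an illustration of the request body above, a server could assemble it with a helper like the following. The function name build_register_body is hypothetical, the output is written on one line without the indentation shown above, and the sketch assumes that the field values contain no XML special characters, so no escaping is performed.

```cpp
#include <cassert>
#include <string>

// Builds the XML body of a register / unregister request.
// Assumption: the values contain no characters such as '<' or '&'.
std::string build_register_body(const std::string& id, const std::string& name,
                                const std::string& version,
                                const std::string& ip, int port) {
    assert(port > 0 && port < 65536);
    std::string body = "<server>";
    body += "<id>" + id + "</id>";
    body += "<name>" + name + "</name>";
    body += "<version>" + version + "</version>";
    body += "<ip>" + ip + "</ip>";
    body += "<port>" + std::to_string(port) + "</port>";
    body += "</server>";
    return body;
}
```

Sending the same body to /register at a fixed interval would then act as the heartbeat, and sending it to /unregister would withdraw the instance.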

5.2 Directory Server – Get Server Instances by ID

More specifically, the semantics of this interface are “get server instances by both ID and version”. When the directory server receives this message, it first searches its hash table for the key (the concatenation of ID and version), as described in Section 4.1.2. It then returns all the live server instances to the client. Note that this interface may return multiple server instances at once.

1) Request Url

GET “http://$host:$port/get_insts_by_id?id=$program_id&version=$program_version”

2) Reply Body

<service>
  <server>
    <id>program_id</id>
    <name>program_name</name>
    <version>program_version</version>
    <ip>instance_ip</ip>
    <port>instance_port</port>
  </server>
</service>

3) Parameters

Direction                 Parameter        Description
Request                   program_id       The globally unique program ID.
Request                   program_version  The version of the program.
Reply (one per instance)  program_id       The globally unique program ID.
Reply (one per instance)  program_name     The name of the program.
Reply (one per instance)  program_version  The version of the program.
Reply (one per instance)  instance_ip      The network address of the server instance.
Reply (one per instance)  instance_port    The end point (port) of the server instance.

5.3 Server – Get Server ID (Procedure 0)

Likewise, the semantics of this interface are not just “get server id”: the program id and version are both returned to the client. During binding, after getting the server’s ip and port from the directory server, the client obtains the remote server’s id and version through this interface and then checks whether they match the expected id and version.

1) Request Url

GET “http://$host:$port/get_svr_id”

2) Reply Body

<server>
  <id>program_id</id>
  <name>program_name</name>
  <version>program_version</version>
</server>

3) Parameters

Direction  Parameter        Description
Request    (none)
Reply      program_id       The globally unique program ID.
Reply      program_name     The name of the program.
Reply      program_version  The version of the program.

 

5.4 Server – Call by ID

This interface, provided by the server, is used for the actual remote procedure call. The parameter $procedure_id in the URL is used by the server to route the request to the correct procedure. The entire HTTP body is the marshalled data.

1) Request Url

POST “http://$host:$port/call_by_id?id=$procedure_id”

2) Request Body

The parameters marshalled by the client stub.

3) Reply Body

The parameters marshalled by the server stub.

4) Parameters

Direction  Parameter     Description
Request    procedure_id  The unique id of the procedure within a program.
Request    request data  The marshalled parameters.
Reply      reply data    The marshalled parameters and return value.
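To make the call_by_id message format concrete, the sketch below assembles the text of such a POST request. The function name build_call_request, the HTTP/1.1 version string, and the Host header are assumptions for illustration; the source only specifies the URL, the POST method, and that the header carries the message length.

```cpp
#include <cassert>
#include <string>

// Builds a call_by_id request: the procedure id travels in the URI and the
// marshalled parameters travel in the body, whose length is announced in
// the Content-Length header.
std::string build_call_request(const std::string& host, int port,
                               int procedure_id, const std::string& body) {
    assert(port > 0 && port < 65536);
    std::string req = "POST /call_by_id?id=" + std::to_string(procedure_id) +
                      " HTTP/1.1\r\n";
    req += "Host: " + host + ":" + std::to_string(port) + "\r\n";
    req += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    req += "\r\n";  // blank line separates the header from the body
    req += body;    // may hold binary data; std::string keeps NUL bytes
    return req;
}
```

Since the marshalled data is binary, the receiver should rely on Content-Length rather than on a terminator to find the end of the body.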

6. Conclusion

In this project, we implemented a Simple Remote Procedure Call Model (SRPC), including the interactions between the IDL and the stub generator, between the server and the directory server, and between the server and the client. To achieve transparency from the high-level user’s point of view, we proposed several designs for message passing between the client and the server, and addressed problems related to parameter marshaling, exception handling, and data representation.

In sum, we have achieved:

  • Advanced treatment for compound data types such as Array and Matrix.
  • Hiding the differences between physical machines representing data in little endian or big endian format.
  • Heartbeat design that keeps the directory server informed of which servers are still alive.